Preface


This book is an introduction to Number Theory from a more geometric point 
of view than is usual for the subject, inspired by the idea that pictures are often a 
great aid to understanding. The title of the book, Topology of Numbers, is intended 
to express this visual slant, where we are using the term “Topology” with its general
meaning of “the spatial arrangement and interlinking of the components of a system”.

The other unusual aspect of the book is that, rather than giving a broad introduction
to all the basic tools of Number Theory without going too deeply into any one, it
focuses on a single topic, quadratic forms $Q(x, y) = ax^2 + bxy + cy^2$ with integer
coefficients. Here there is a very rich theory that one can really immerse oneself into 
to get a deeper sense of the beauty and subtlety of Number Theory. Along the way 
we do in fact encounter many standard number-theoretic tools, with some context to 
show how useful they can be. 

A central geometric theme of the book is a certain two-dimensional figure known 
as the Farey diagram, discovered by Adolf Hurwitz in 1894, which displays certain 
relationships between rational numbers beyond just their usual distribution along the 
one-dimensional real number line. Among the many things the diagram elucidates 
that will be explored in the book are Pythagorean triples, the Euclidean algorithm, 
Pell’s equation, continued fractions, Farey sequences, and two-by-two matrices with
integer entries and determinant $\pm 1$.

But most importantly for this book, the Farey diagram can be used to study 
quadratic forms $Q(x, y) = ax^2 + bxy + cy^2$ via John Conway’s marvelous idea of the
topograph of such a form. The origins of the wonderfully subtle theory of quadratic 
forms can be traced back to ancient times. In the 1600s interest was reawakened by 
numerous discoveries of Fermat, but it was only in the period 1750-1800 that Euler, 
Lagrange, Legendre, and especially Gauss were able to uncover the main features of 
the theory. 

The principal goal of the book is to present an accessible introduction to this 
theory from a geometric viewpoint that complements the usual purely algebraic ap- 
proach. Prerequisites for reading the book are fairly minimal, hardly going beyond 
high school mathematics for the most part. One topic that often forms a significant 
part of elementary number theory courses is congruences modulo an integer n. It 
would be helpful if the reader has already seen and used these a little since we will not 
develop congruence theory as a separate topic and will instead just use congruences
as the need arises, proving whatever nontrivial facts are required including several 
of the basic ones that form part of a standard introductory number theory course. 
Among these is quadratic reciprocity, where we give Eisenstein’s classical proof since 
it involves some geometry. 

The high point of the basic theory of quadratic forms Q(x, y) is the class group 
first constructed by Gauss. This can be defined purely in terms of quadratic forms, 
which is how it was first presented, or by means of Dedekind’s notion of ideals introduced
some 75 years after Gauss’s work. For subsequent developments and generalizations
the viewpoint of ideals has proven to be central to all of modern algebra. In
this book we present both approaches to the class group, first the older version just 
in terms of forms, then the later version using ideals. 


Here is how the book is organized. A preliminary Chapter 0 gives a sample of 
some of the sorts of questions studied in Number Theory, in particular motivating 
the study of quadratic forms by seeing how they arise in understanding Pythagorean 
triples, the integer side-lengths of right triangles, such as 3,4,5 and 5,12,13. 

After this introduction the next three chapters lay the groundwork for our ap- 
proach to quadratic forms by introducing the Farey diagram and its first applications 
to visualizing the Euclidean algorithm and continued fractions, both finite and infinite. 

The next four chapters are the heart of the book. Chapter 4 introduces the to- 
pograph of a quadratic form, which displays all its values visually in a convenient 
and effective picture. A variety of examples are given illustrating different kinds of 
qualitative behavior of the topograph. As applications, topographs give efficient ways 
to compute the values of periodic and eventually periodic continued fractions, and to 
find all the integer solutions of Pell’s equation $x^2 - dy^2 = \pm 1$.

Chapter 5 develops the classification theory for quadratic forms $ax^2 + bxy + cy^2$
in terms of the discriminant $b^2 - 4ac$. There are only a finite number of essentially
distinct forms of a given discriminant, and it is shown how to compute these. Forms 
with symmetry play a special role, and a fairly complete picture of these is developed. 

Chapter 6 turns to the fundamental representation problem, which is to find all 
the values a given form takes on, or in other words, to determine when an equation 
$ax^2 + bxy + cy^2 = n$ has integer solutions. There are two central themes here: how
the factorization of n into primes plays a key role, largely reducing the problem to 
the case that n itself is prime; and how congruences modulo the discriminant give 
useful criteria for solvability, particularly in the case of primes. 

Chapter 7 completes the basic theory by presenting Gauss’s discovery of a way to 
multiply forms of a given discriminant, refining the multiplication of the values of the 
forms. This leads to an explanation of the seemingly mysterious fact that while there 
is essentially only one form of a given discriminant that represents a given prime, 
there can be several different forms representing nonprimes. 

Finally, the rather lengthy Chapter 8 goes in a different direction to give an exposition
of the alternative viewpoint toward quadratic forms by expanding the set of
rational numbers to sets of numbers $a + b\sqrt{n}$ with a and b rational. Here the deeper
subtleties of quadratic forms are translated into subtleties with the factorization of 
such numbers into “primes” and the lack of uniqueness of such factorizations. In 
keeping with the viewpoint of the rest of the book, we strive to make this essentially 
algebraic theory as geometric as possible. 

At the end of the book there are several tables giving the key data for quadratic 
forms of small discriminant. 


This book will remain available online in electronic form for free downloading 
after it has been published in the traditional paper form. The web address where it 
can be found is 


http://www.math.cornell.edu/~hatcher

Also available here will be a list of corrections as well as possible revisions and additions
to the book. Readers are encouraged to send comments and corrections to the
email address posted on the web page.

In this preliminary Chapter 0 we introduce by means of examples some of the
main themes of Number Theory, particularly those that will be emphasized in the rest 
of the book. 


Pythagorean Triples 


Let us begin by considering right triangles whose sides all have integer lengths. 
The most familiar example is the (3,4,5) right triangle, but there are many others as 
well, such as the (5,12,13) right triangle. Thus we are looking for triples (a,b,c) of
positive integers such that $a^2 + b^2 = c^2$. Such triples are called Pythagorean triples
because of the connection with the Pythagorean Theorem. Our goal will be a formula
that gives them all. The ancient Greeks knew such a formula, and even before the
Greeks the ancient Babylonians must have known a lot about Pythagorean triples because
one of their clay tablets from nearly 4000 years ago has been found which gives a
list of 15 different Pythagorean triples, the largest of which is (12709, 13500, 18541).
(Actually, the tablet only gives the numbers a and c from each triple (a,b,c) for 
some unknown reason, but it is easy to compute b from a and c.) 

There is an easy way to create infinitely many Pythagorean triples from a given 
one just by multiplying each of its three numbers by an arbitrary number n. For 
example, from (3,4,5) we get (6,8,10), (9,12,15), (12,16,20), and so on. This 
process produces right triangles that are all similar to each other, so in a sense they 
are not essentially different triples. In our search for Pythagorean triples there is 
thus no harm in restricting our attention to triples (a,b,c) whose three numbers 
have no common factor. Such triples are called primitive. The large Babylonian triple
mentioned above is primitive, since the prime factorization of 13500 is $2^2 \cdot 3^3 \cdot 5^3$ but
the other two numbers in the triple are not divisible by 2, 3, or 5.

A fact worth noting in passing is that if two of the three numbers in a Pythagorean 
triple (a,b,c) have a common factor n, then n is also a factor of the third number. 
This follows easily from the equation $a^2 + b^2 = c^2$, since for example if n divides a
and b, then $n^2$ divides $a^2$ and $b^2$, so $n^2$ divides their sum $c^2$, hence n divides c.
Another case is that n divides a and c. Then $n^2$ divides $a^2$ and $c^2$, so $n^2$ divides
their difference $c^2 - a^2 = b^2$, hence n divides b. In the remaining case that n divides
b and c the argument is similar.

A consequence of this divisibility fact is that primitive Pythagorean triples can also
be characterized as the ones for which no two of the three numbers have a common 
factor. 

If (a,b,c) is a Pythagorean triple, then we can divide the equation $a^2 + b^2 = c^2$
by $c^2$ to get an equivalent equation $(a/c)^2 + (b/c)^2 = 1$. This equation is saying
that the point $(x, y) = (a/c, b/c)$ is on the unit circle $x^2 + y^2 = 1$ in the xy-plane.
The coordinates $a/c$ and $b/c$ are rational numbers, so each Pythagorean triple gives a
rational point on the circle, a point whose coordinates are both rational. Notice that
multiplying each of a, b, and c by the same nonzero integer n yields the same point
(x,y) on the circle. Going in the other direction, given a rational point on the circle,
we can find a common denominator for its two coordinates so that it has the form
$(a/c, b/c)$ and hence gives a Pythagorean triple (a,b,c). We can assume this triple is
primitive by canceling any common factor of a, b, and c, and this does not change
the point $(a/c, b/c)$. The two fractions $a/c$ and $b/c$ must then be in lowest terms since
we observed earlier that if two of a, b, c have a common factor, then all three have
a common factor.


From the preceding observations we can conclude that the problem of finding 
all Pythagorean triples is equivalent to finding all rational points on the unit circle 
$x^2 + y^2 = 1$. More specifically, there is an exact one-to-one correspondence between
primitive Pythagorean triples and rational points on the unit circle that lie in the 
interior of the first quadrant (since we want all of a,b,c,x,y to be positive). 

In order to find all the rational points on the circle $x^2 + y^2 = 1$ we will use
a construction that starts with one rational point and creates many more rational
points from this one starting point. The four obvious rational points on the circle are
the intersections of the circle with the coordinate axes, which are the points $(\pm 1, 0)$
and $(0, \pm 1)$. It does not matter which one we choose as the starting point, so let
us choose (0,1). Now consider a line which intersects the circle in this point (0,1) and
some other point P, as in the figure at the right. If the line has slope m, its equation
will be $y = mx + 1$. If we denote the point where the line intersects the x-axis by (r,0),
then $m = -1/r$, so the equation for the line can be rewritten as $y = 1 - x/r$. Here we
assume r is nonzero since r = 0 corresponds to the slope m being infinite and the
point P being $(0, -1)$, a rational point we already know about. To find the coordinates
of the point P in terms of r when $r \ne 0$ we substitute $y = 1 - x/r$ into the equation
$x^2 + y^2 = 1$ and solve for x:

\[ x^2 + \left(1 - \frac{x}{r}\right)^2 = 1 \]
\[ x^2 + 1 - \frac{2x}{r} + \frac{x^2}{r^2} = 1 \]
\[ \left(\frac{r^2 + 1}{r^2}\right) x^2 = \frac{2x}{r} \]

We are assuming $P \ne (0, -1)$ so $x \ne 0$ and we can cancel an x from both sides of the
last equation above and then solve for x to get $x = \frac{2r}{r^2+1}$. Plugging this into the
formula $y = 1 - x/r$ gives $y = 1 - \frac{2}{r^2+1} = \frac{r^2-1}{r^2+1}$. Thus the coordinates (x, y)
of the point P are given by:

\[ (x, y) = \left( \frac{2r}{r^2+1},\ \frac{r^2-1}{r^2+1} \right) \]

Note that in these formulas we no longer have to exclude the value r = 0, which just 
gives the point $(0, -1)$. Observe also that if we let r approach $\pm\infty$ then the point P
approaches (0,1), as we can see either from the picture or from the formulas. 

If r is a rational number, then the formula for (x,y) shows that both x and y 
are rational, so we have a rational point on the circle. Conversely, if both coordinates 
x and y of the point P on the circle are rational, then the slope m of the line must 
be rational, hence r must also be rational since $r = -1/m$. We could also solve the
equation $y = 1 - x/r$ for r to get $r = \frac{x}{1-y}$, showing again that r will be rational if
x and y are rational (and y is not 1). The conclusion of all this is that, starting from
the initial rational point (0,1) we have found formulas that give all the other rational 
points on the circle. 
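
As a quick sanity check on these formulas, here is a minimal Python sketch using exact rational arithmetic. The function names `circle_point` and `slope_parameter` are our own labels for illustration and do not come from the text.

```python
from fractions import Fraction


def circle_point(r):
    """Point P where the line through (0, 1) and (r, 0) meets the unit circle again."""
    r = Fraction(r)
    return 2 * r / (r * r + 1), (r * r - 1) / (r * r + 1)


def slope_parameter(x, y):
    """Recover r = x / (1 - y) from a point (x, y) on the circle other than (0, 1)."""
    return Fraction(x) / (1 - Fraction(y))


for r in [Fraction(2), Fraction(3, 2), Fraction(1, 2), Fraction(0)]:
    x, y = circle_point(r)
    assert x * x + y * y == 1          # P lies on the unit circle
    assert slope_parameter(x, y) == r  # the two formulas are inverse to each other
    print(f"r = {r}:  P = ({x}, {y})")
```

Using `Fraction` rather than floating point keeps every check exact, so the assertions test the algebra itself and not a rounding tolerance.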

Since there are infinitely many different choices for the rational number r, there
are infinitely many rational points on the circle. But we can say something much
stronger than this: every arc of the circle, no matter how small, contains infinitely
many rational points. This is because every arc on the circle corresponds to an interval
of r-values on the x-axis, and every interval in the x-axis contains infinitely many
rational numbers. Since every arc on the circle contains infinitely many rational points,
we can say that the rational points are dense in the circle, meaning that for every point
on the circle there is an infinite sequence of rational points approaching the given 
point. 

Now we can go back and find formulas for Pythagorean triples. If we set the 
rational number r equal to $p/q$ with p and q integers having no common factor,
then the formulas for x and y become:

\[ x = \frac{2(p/q)}{(p/q)^2 + 1} = \frac{2pq}{p^2 + q^2} \qquad y = \frac{(p/q)^2 - 1}{(p/q)^2 + 1} = \frac{p^2 - q^2}{p^2 + q^2} \]

These formulas give the ratios $x = a/c$ and $y = b/c$ for all Pythagorean triples
(a,b,c), so they determine all Pythagorean triples up to multiplication by a constant.
The simplest way to realize the ratios $a/c = \frac{2pq}{p^2+q^2}$ and $b/c = \frac{p^2-q^2}{p^2+q^2}$ is
just to take:

\[ (a, b, c) = (2pq,\ p^2 - q^2,\ p^2 + q^2) \]


The Pythagorean triples given by this formula may not be primitive, however. For 
example, if p and q are both odd then $p^2 - q^2$ and $p^2 + q^2$ are both even, as is $2pq$,
so the triple could be simplified by dividing by 2. The nonprimitive triples obtained 
in this way are the starred entries in the table below. 


(p,q)     (x,y)              (a,b,c)
(2,1)     (4/5, 3/5)         (4, 3, 5)
(3,1)*    (6/10, 8/10)*      (6, 8, 10)* → (3, 4, 5)
(3,2)     (12/13, 5/13)      (12, 5, 13)
(4,1)     (8/17, 15/17)      (8, 15, 17)
(4,3)     (24/25, 7/25)      (24, 7, 25)
(5,1)*    (10/26, 24/26)*    (10, 24, 26)* → (5, 12, 13)
(5,2)     (20/29, 21/29)     (20, 21, 29)
(5,3)*    (30/34, 16/34)*    (30, 16, 34)* → (15, 8, 17)
(5,4)     (40/41, 9/41)      (40, 9, 41)
(6,1)     (12/37, 35/37)     (12, 35, 37)
(6,5)     (60/61, 11/61)     (60, 11, 61)
(7,1)*    (14/50, 48/50)*    (14, 48, 50)* → (7, 24, 25)
(7,2)     (28/53, 45/53)     (28, 45, 53)
(7,3)*    (42/58, 40/58)*    (42, 40, 58)* → (21, 20, 29)
(7,4)     (56/65, 33/65)     (56, 33, 65)
(7,5)*    (70/74, 24/74)*    (70, 24, 74)* → (35, 12, 37)
(7,6)     (84/85, 13/85)     (84, 13, 85)
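
A table of this kind is easy to regenerate by machine. The following Python sketch lists the triples $(2pq, p^2 - q^2, p^2 + q^2)$ for coprime pairs with $p \le 7$ and marks the nonprimitive ones with a star. The helper names `formula_triple` and `table_rows` are hypothetical, chosen only for this illustration.

```python
from math import gcd


def formula_triple(p, q):
    """The triple (2pq, p^2 - q^2, p^2 + q^2) from the text."""
    return 2 * p * q, p * p - q * q, p * p + q * q


def table_rows(max_p):
    """Rows (p, q, triple, reduced triple) for coprime p > q >= 1 with p <= max_p."""
    rows = []
    for p in range(2, max_p + 1):
        for q in range(1, p):
            if gcd(p, q) == 1:
                a, b, c = formula_triple(p, q)
                g = gcd(gcd(a, b), c)
                rows.append((p, q, (a, b, c), (a // g, b // g, c // g)))
    return rows


for p, q, t, reduced in table_rows(7):
    star = "*" if t != reduced else ""
    arrow = f" -> {reduced}" if t != reduced else ""
    print(f"({p},{q}){star}  {t}{star}{arrow}")
```

In this range the starred rows are exactly those with p and q both odd, in line with the parity remark preceding the table.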


Notice that the primitive versions of the starred triples occur higher in the table, but 
with a and b switched. This is a general phenomenon, as we will see in the course of 
proving the following basic result: 


Proposition. All primitive Pythagorean triples (a,b,c), after perhaps interchanging
a and b, are obtained from the formula $(a, b, c) = (2pq, p^2 - q^2, p^2 + q^2)$ by
letting p and q range over all positive integers with p > q, such that p and q
have no common factor and are of opposite parity (one even and the other odd).


Proof: We have seen that the formula $(a, b, c) = (2pq, p^2 - q^2, p^2 + q^2)$ yields all
Pythagorean triples up to multiplication by a constant, so we just need to investigate
when the formula gives a primitive triple and what to do when it gives a nonprimitive
triple. As before we can assume that p and q have no common divisor, and we can
assume that p > q in order for the middle coordinate $b = p^2 - q^2$ to be positive.


Case 1: Suppose p and q have opposite parity. If all three of $2pq$, $p^2 - q^2$, and
$p^2 + q^2$ have a common divisor d > 1 then d would have to be odd since $p^2 - q^2$ and
$p^2 + q^2$ are odd when p and q have opposite parity. Furthermore, since d is a divisor
of both $p^2 - q^2$ and $p^2 + q^2$ it must divide their sum $(p^2 + q^2) + (p^2 - q^2) = 2p^2$
and their difference $(p^2 + q^2) - (p^2 - q^2) = 2q^2$. However, since d is odd it would
then have to divide $p^2$ and $q^2$, forcing p and q to have a common factor (since any
prime factor of d would have to divide p and q). This contradicts the assumption
that p and q have no common factors, so we conclude that $(2pq, p^2 - q^2, p^2 + q^2)$
is primitive if p and q have opposite parity.

Case 2: Suppose p and q have the same parity. Then their sum and difference are
both even and we can write $p + q = 2P$ and $p - q = 2Q$ for some integers P and Q.
Any common factor of P and Q would have to divide $P + Q = \frac{1}{2}(p+q) + \frac{1}{2}(p-q) = p$
and $P - Q = \frac{1}{2}(p+q) - \frac{1}{2}(p-q) = q$, so P and Q have no common factors. In
terms of P and Q our Pythagorean triple becomes:

\begin{align*}
(a, b, c) &= (2pq,\ p^2 - q^2,\ p^2 + q^2) \\
&= (2(P+Q)(P-Q),\ (P+Q)^2 - (P-Q)^2,\ (P+Q)^2 + (P-Q)^2) \\
&= (2(P^2 - Q^2),\ 4PQ,\ 2(P^2 + Q^2)) \\
&= 2(P^2 - Q^2,\ 2PQ,\ P^2 + Q^2)
\end{align*}

Canceling the factor of 2 in front of this last expression gives a new Pythagorean triple
$(P^2 - Q^2, 2PQ, P^2 + Q^2)$ of the same type $(2pq, p^2 - q^2, p^2 + q^2)$ that we started with
but with the first two coordinates switched. This new triple is primitive by Case 1
since P and Q cannot have the same parity, otherwise $p = P + Q$ and $q = P - Q$
would both be even, which is impossible since they have no common factor.


From Cases 1 and 2 we can conclude that if we allow ourselves to switch the first 
two coordinates, then we get all primitive Pythagorean triples from the formula by 
restricting p and q to be of opposite parity and have no common factors. □
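
The Proposition can be tested numerically for small triples. The Python sketch below compares the triples produced by the formula with those found by an exhaustive search, sorting the two legs so that the switch of a and b does not matter. This is only a finite check under our own choice of bound, not a substitute for the proof, and both function names are hypothetical.

```python
from math import gcd, isqrt


def primitive_triples_by_formula(limit):
    """Triples from (2pq, p^2 - q^2, p^2 + q^2) with c <= limit, legs sorted."""
    found = set()
    p = 2
    while p * p + 1 <= limit:            # smallest c for a given p is p^2 + 1
        for q in range(1, p):
            c = p * p + q * q
            if c > limit:
                break
            if gcd(p, q) == 1 and (p - q) % 2 == 1:   # coprime, opposite parity
                a, b = 2 * p * q, p * p - q * q
                found.add((min(a, b), max(a, b), c))
        p += 1
    return found


def primitive_triples_by_search(limit):
    """All primitive triples a < b < c with c <= limit, found by brute force."""
    found = set()
    for a in range(1, limit):
        for b in range(a + 1, limit):
            c2 = a * a + b * b
            c = isqrt(c2)
            if c > limit:
                break
            if c * c == c2 and gcd(gcd(a, b), c) == 1:
                found.add((a, b, c))
    return found


print(primitive_triples_by_formula(200) == primitive_triples_by_search(200))
```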


Pythagorean Triples and Quadratic Forms 


There are many questions one can ask about Pythagorean triples (a,b,c). For 
example, we could begin by asking which numbers actually arise as the numbers a, 
b, or c in some Pythagorean triple. It is sufficient to answer the question just for 
primitive Pythagorean triples, since the remaining ones are obtained by multiplying 
by arbitrary positive integers. We know all primitive Pythagorean triples arise from 
the formula 

\[ (a, b, c) = (2pq,\ p^2 - q^2,\ p^2 + q^2) \]


where p and q have no common factor and are of opposite parity. The latter condition 
just amounts to saying p and q are not both odd since they cannot both be even if 
they have no common factor. Determining whether a given number can be expressed 
in one of the forms $2pq$, $p^2 - q^2$, or $p^2 + q^2$ is a special case of the general question
of deciding when an equation $Ap^2 + Bpq + Cq^2 = n$ has an integer solution p, q, for
given integers A, B, C, and n. Expressions of the form $Ax^2 + Bxy + Cy^2$ are called
quadratic forms. These will be the main topic studied in Chapters 4-8, where we will 
develop some general theory addressing the question of what values a quadratic form 
takes on when all the numbers involved are integers. For now, let us just look at the 
special cases at hand. 


First let us consider which numbers occur as a or b in primitive Pythagorean 
triples (a,b,c). A trivial case is the equation $0^2 + 1^2 = 1^2$ which shows that 0 and
1 can be realized by the triple (0,1, 1) which is primitive, so let us focus on realizing 
numbers bigger than 1. If we look at the earlier table of Pythagorean triples we see 
that all the numbers up to 15 can be realized as a or b in primitive triples except for 
2,6, 10, and 14. This might lead us to guess that the numbers realizable as a or b in 
primitive Pythagorean triples are the numbers not of the form 4k + 2. This is indeed 
true, and can be proved as follows. First note that since $2pq$ is even, $p^2 - q^2$ must
be odd, otherwise both a and b would be even, violating primitivity. Now, every odd
number is expressible in the form $p^2 - q^2$ since $2k + 1 = (k+1)^2 - k^2$, so in fact every
odd number is the difference between two consecutive squares. Taking p = k + 1 and
q = k yields a primitive triple since k and k + 1 always have opposite parity and no
common factors. This takes care of realizing odd numbers. For even numbers, they 
would have to be expressible as 2pq with p and q of opposite parity, which forces 
pq to be even so 2pq is a multiple of 4 and hence cannot be of the form 4k + 2. On 
the other hand, if we take p = 2k and q = 1 then 2pq = 4k with p and q having 
opposite parity and no common factors. 

To summarize, we have shown that all positive numbers 2k + 1 and 4k occur as a
or b in primitive Pythagorean triples but none of the numbers 4k + 2 occur. To finish 
the story, note that a number a = 4k +2 which cannot be realized in a primitive triple 
can be realized by a nonprimitive triple just by taking a triple (a,b,c) with a = 2k+1 
and doubling each of a, b, and c. Thus all numbers can be realized as a or b in 
Pythagorean triples (a,b,c). 
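
The statement about which numbers occur as a or b in primitive triples can be confirmed by machine in a small range. The Python sketch below collects the legs $2pq$ and $p^2 - q^2$ from the formula and compares them with the numbers not of the form 4k + 2. The function name `primitive_legs` and the bound 60 are our own choices for illustration.

```python
from math import gcd


def primitive_legs(max_p):
    """Legs 2pq and p^2 - q^2 of primitive triples with q < p <= max_p."""
    legs = set()
    for p in range(2, max_p + 1):
        for q in range(1, p):
            if gcd(p, q) == 1 and (p - q) % 2 == 1:
                legs.add(2 * p * q)
                legs.add(p * p - q * q)
    return legs


bound = 60
# Taking max_p = bound is enough: an odd n = 2k + 1 uses p = k + 1,
# and n = 4k uses p = 2k, both of which are at most n.
legs = primitive_legs(bound)
for n in range(2, bound + 1):
    assert (n in legs) == (n % 4 != 2)
print(sorted(n for n in range(2, 21) if n not in legs))
```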


Now let us ask which numbers c can occur in Pythagorean triples (a,b,c), so we 
are trying to find a solution of $p^2 + q^2 = c$ for a given number c. Pythagorean triples
(p,q,r) give solutions when c is equal to a square $r^2$, but we are asking now about
arbitrary numbers c. It suffices to figure out which numbers c occur in primitive 
triples (a,b,c), since by multiplying the numbers c in primitive triples by arbitrary 
numbers we get the numbers c in arbitrary triples. A look at the earlier table shows 
that the numbers c that can be realized by primitive triples (a,b,c) seem to be fairly 
rare: only 5, 13, 17, 25, 29, 37, 41, 53, 61, 65, and 85 occur in the table. These are all
odd, and in fact they are all of the form 4k + 1. This always has to be true because
p and q are of opposite parity, so one is an even number 2k and the other an odd
number 2l + 1. Squaring, we get $(2k)^2 = 4k^2$ and $(2l+1)^2 = 4(l^2 + l) + 1$. Thus the
square of an even number has the form 4u and the square of an odd number has the
form 4v + 1. Hence $p^2 + q^2$ has the form 4(u + v) + 1, or more simply, just 4k + 1.


The argument we just gave can be expressed more concisely using congruences 
modulo 4. We will assume the reader has seen something about congruences before, 
but to recall the terminology: two numbers a and b are said to be congruent modulo a 
number n if their difference a—b is a multiple of n. When n is negative, congruence 
modulo n is equivalent to congruence modulo |n|, so there is no loss of generality 
in restricting attention just to congruence modulo positive numbers. Congruence 
modulo 0 is the same as equality, so there is little reason to consider this case. One 
writes $a \equiv b$ mod n to mean that a is congruent to b modulo n, with the word
“modulo” abbreviated to “mod”. One can tell whether two numbers are congruent
mod n by dividing each of them by n and checking whether the remainders, which
lie between 0 and n − 1, are equal. Every number is congruent mod n to one of the
numbers $0, 1, 2, \dots, n-1$, and no two of these numbers are congruent to each other,
so there are exactly n congruence classes of numbers mod n, where a congruence
class means all the numbers congruent to a given number. In the preceding paragraph
we were in effect dealing with congruence classes mod 4 and we saw that the square
of an even number is congruent to 0 mod 4 while the square of an odd number is
congruent to 1 mod 4, hence $p^2 + q^2$ is congruent to 0 + 1 or 1 + 0 mod 4 when p
and q have opposite parity, so $p^2 + q^2 \equiv 1$ mod 4.
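
For readers who like to experiment, the congruence facts used here take only a few lines of Python to check. The function name `square_residues` is our own, and the range of p and q tested is an arbitrary small sample.

```python
def square_residues(n):
    """Remainders that squares can leave on division by n."""
    return sorted({(k * k) % n for k in range(n)})


# Squares are congruent to 0 or 1 mod 4, never 2 or 3.
print(square_residues(4))

# p^2 + q^2 with p even and q odd always leaves remainder 1 mod 4.
assert all((p * p + q * q) % 4 == 1
           for p in range(0, 40, 2) for q in range(1, 40, 2))
```

It suffices to square one representative of each congruence class, which is why the loop in `square_residues` runs only over 0, 1, ..., n − 1.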

Returning to the question of which numbers occur as c in primitive Pythagorean 
triples (a,b,c), we have seen that $c \equiv 1$ mod 4, but looking again at the list 5, 13, 17,
25, 29, 37, 41, 53, 61, 65, 85 we can observe the more interesting fact that most of these
numbers are primes, and the ones that are not primes are products of earlier primes
in the list: $25 = 5 \cdot 5$, $65 = 5 \cdot 13$, $85 = 5 \cdot 17$. From this somewhat slim evidence
one might conjecture that the numbers c occurring in primitive Pythagorean triples
are exactly the numbers that are products of primes congruent to 1 mod 4. The first
prime satisfying this condition that is not on the original list is 73, and this is realized
as $p^2 + q^2 = 8^2 + 3^2$ in the triple (48, 55, 73). The next two primes congruent to 1
mod 4 are $89 = 8^2 + 5^2$ and $97 = 9^2 + 4^2$, so the conjecture continues to look good. As
further evidence for the conjecture, numbers congruent to 1 mod 4 that are not on
the list such as $9 = 3 \cdot 3$, $21 = 3 \cdot 7$, $33 = 3 \cdot 11$, $45 = 3^2 \cdot 5$, $49 = 7 \cdot 7$, and $57 = 3 \cdot 19$
each have a prime factor that is not congruent to 1 mod 4.
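
More evidence for the conjecture is easy to gather by computer. The Python sketch below compares the values $c = p^2 + q^2$ coming from primitive triples with the numbers all of whose prime factors are congruent to 1 mod 4, up to a bound of our own choosing. The function names are hypothetical, and the factoring routine is plain trial division, adequate only for small numbers.

```python
from math import gcd


def primitive_hypotenuses(limit):
    """Values c = p^2 + q^2 <= limit with p > q >= 1 coprime of opposite parity."""
    hyps = set()
    p = 2
    while p * p + 1 <= limit:
        for q in range(1, p):
            c = p * p + q * q
            if c <= limit and gcd(p, q) == 1 and (p - q) % 2 == 1:
                hyps.add(c)
        p += 1
    return hyps


def prime_factors(n):
    """Prime factors of n with repetition, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


limit = 500
conjectured = {n for n in range(2, limit + 1)
               if all(f % 4 == 1 for f in prime_factors(n))}
print(primitive_hypotenuses(limit) == conjectured)
```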

More generally, if we ask which numbers can be expressed as $p^2 + q^2$ for integers
p and q having no common divisor without requiring them to have opposite parity,
then we will also get the numbers c in the starred entries of the earlier table. As we
saw in the proof of the proposition about Pythagorean triples, these values of c are
just the doubles of the values of c in primitive Pythagorean triples. Thus one can
conjecture that the numbers expressible as $p^2 + q^2$ for positive integers p and q
having no common divisor are the products of primes congruent to 1 mod 4 and the
doubles of these products. This conjecture is correct, but proving it is not easy. We
will do this in Chapter 6. 

After this it is easy to go the last step and ask which numbers are sums $p^2 + q^2$
for arbitrary positive integers p and q. Now we are free to multiply p and q by the
same positive integer k, which multiplies $p^2 + q^2$ by $k^2$. This leads to the answer
that the numbers expressible as $p^2 + q^2$, besides 0 and 1, are all the numbers n
for which each prime factor congruent to 3 mod 4 occurs to an even power in the
prime factorization of n. Thus the sequence of numbers that are sums of two squares
begins 0, 1, 2, 4, 5, 8, 9, 10, 13, 16, 17, 18, 20, 25, 26, 29, 32, 34, 36, 37, 40, ···.
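
The prime factor criterion can be compared with a direct search. In the Python sketch below we assume that p or q is allowed to be 0, since the sequence just listed contains 0, 1, 4, and 9. Both function names are our own, and the comparison is a finite check, not a proof.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: is n = p^2 + q^2 for some integers p, q >= 0?"""
    for p in range(isqrt(n) + 1):
        q2 = n - p * p
        q = isqrt(q2)
        if q * q == q2:
            return True
    return False


def passes_prime_criterion(n):
    """True if every prime factor congruent to 3 mod 4 occurs to an even power."""
    d = 2
    while d * d <= n:
        exponent = 0
        while n % d == 0:
            n //= d
            exponent += 1
        if d % 4 == 3 and exponent % 2 == 1:
            return False
        d += 1
    # Whatever is left is 1 or a single prime occurring to the first power.
    return not (n > 1 and n % 4 == 3)


print([n for n in range(41) if is_sum_of_two_squares(n)])
assert all(is_sum_of_two_squares(n) == passes_prime_criterion(n)
           for n in range(2000))
```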


Another question one can ask about Pythagorean triples is how many there are 
with two of the three numbers differing only by 1. In the earlier table there are 
several: (3,4,5), (5,12,13), (7,24,25), (20,21,29), (9,40,41), (11,60,61), and 
(13, 84,85). As the pairs of numbers that differ by 1 get larger, the corresponding 
right triangles are either approximately 45-45-90 right triangles, as with the triple 
(20, 21, 29), or long thin triangles, as with (13, 84,85). To analyze the possibilities, 
note first that if two of the numbers in a triple (a,b,c) differ by 1 then the triple 
has to be primitive, so we can use our formula $(a, b, c) = (2pq, p^2 - q^2, p^2 + q^2)$.
If b and c differ by 1 then we would have $(p^2 + q^2) - (p^2 - q^2) = 2q^2 = 1$ which
is impossible. If a and c differ by 1 then we have $p^2 + q^2 - 2pq = (p - q)^2 = 1$
so $p - q = \pm 1$, and in fact $p - q = +1$ since we must have p > q in order for
$b = p^2 - q^2$ to be positive. Thus we get the infinite sequence of solutions (p,q) =
(2,1), (3,2), (4,3), ··· with corresponding triples (4,3,5), (12,5,13), (24,7,25), ···.
Note that these are the same triples we obtained earlier that realize all the odd values
b = 3, 5, 7, ···.

The remaining case is that a and b differ by 1. Thus we have the equation 
$p^2 - 2pq - q^2 = \pm 1$. The left side does not factor using integer coefficients, so it is
not so easy to find integer solutions this time. In the table there are only the two triples 
(4,3,5) and (20, 21,29), with (p,q) = (2,1) and (5,2). After some trial and error one 
could find the next solution (p,q) = (12,5) which gives the triple (120,119,169). Is 
there a pattern in the solutions (2,1), (5,2), (12,5)? One has the numbers 1,2,5,12, 
and perhaps it is not too great a leap to notice that the third number is twice the second
plus the first, while the fourth number is twice the third plus the second. If this pattern
continued, the next number would be $29 = 2 \cdot 12 + 5$, giving (p,q) = (29,12), and this
does indeed satisfy $p^2 - 2pq - q^2 = 1$, yielding the Pythagorean triple (696, 697, 985).
These numbers are increasing rather rapidly, and the next case (p,q) = (70,29) yields
an even bigger Pythagorean triple (4060, 4059, 5741). Could there be other solutions
of $p^2 - 2pq - q^2 = \pm 1$ with smaller numbers that we missed? We will develop tools in
Chapters 4 and 5 to find all the integer solutions, and it will turn out that the sequence 
we have just discovered gives them all. 
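
The pattern can be explored with a short Python sketch. It generates pairs (p,q) from the rule "twice the last number plus the one before" and evaluates $p^2 - 2pq - q^2$ on each, then compares with a brute-force search over q < p ≤ 100. The function names and the search bound are our own choices, and a finite search of course proves nothing about larger solutions.

```python
def recurrence_solutions(count):
    """First `count` pairs (p, q) of consecutive terms of 1, 2, 5, 12, 29, ..."""
    q, p = 1, 2
    pairs = []
    for _ in range(count):
        pairs.append((p, q))
        q, p = p, 2 * p + q   # next term is twice the last plus the one before
    return pairs


def form_value(p, q):
    """Value of the quadratic form p^2 - 2pq - q^2."""
    return p * p - 2 * p * q - q * q


for p, q in recurrence_solutions(5):
    triple = (2 * p * q, p * p - q * q, p * p + q * q)
    print((p, q), form_value(p, q), triple)

# Every pair with q < p <= 100 on which the form takes the value +1 or -1.
brute = [(p, q) for p in range(2, 101) for q in range(1, p)
         if abs(form_value(p, q)) == 1]
print(brute == recurrence_solutions(5))
```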

Although the quadratic form $p^2 - 2pq - q^2$ does not factor using integer coefficients,
it can be simplified slightly by rewriting it as $(p - q)^2 - 2q^2$. Then if we change
variables by setting $(x, y) = (p - q, q)$ we obtain the quadratic form $x^2 - 2y^2$. Finding
integer solutions of $x^2 - 2y^2 = n$ is equivalent to finding integer solutions of
$p^2 - 2pq - q^2 = n$ since integer values of p and q give integer values of x and y,
and conversely, integer values of x and y give integer values of p and q since when
we solve for p and q in terms of x and y, we again get equations with integer coefficients:
$(p, q) = (x + y, y)$. Thus the quadratic forms $p^2 - 2pq - q^2$ and $x^2 - 2y^2$
are completely equivalent, and finding integer solutions of $p^2 - 2pq - q^2 = \pm 1$ is
equivalent to finding integer solutions of $x^2 - 2y^2 = \pm 1$.

The equation $x^2 - 2y^2 = \pm 1$ is an instance of the equation $x^2 - Dy^2 = \pm 1$ which
is known as Pell’s equation (although sometimes this term is used only when the right
side of the equation is +1 and the other case is called the negative Pell equation).
This is a very famous equation in Number Theory which has arisen in many different
contexts going back hundreds of years. We will develop techniques for finding all
integer solutions of Pell’s equation for arbitrary values of D in Chapters 4 and 5. It
is interesting that certain fairly small values of D can force the solutions to be quite
large. For example, for D = 61 the smallest positive integer solution of $x^2 - 61y^2 = 1$
is a rather large pair:

\[ (x, y) = (1766319049,\ 226153980) \]

As far back as the eleventh and twelfth centuries mathematicians in India knew how to
find this solution. It was rediscovered in the seventeenth century by Fermat in France,
who also gave the smallest solution of $x^2 - 109y^2 = 1$, an even larger pair:

\[ (x, y) = (158070671986249,\ 15140424455100) \]

The way that the size of the smallest solution of $x^2 - Dy^2 = 1$ depends upon D is
very erratic and is still not well understood today. 
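
These large solutions can be reproduced in a few lines of Python. The sketch below does not use the methods of Chapters 4 and 5. It uses instead the standard continued fraction expansion of $\sqrt{D}$, stopping at the first convergent x/y with $x^2 - Dy^2 = 1$. The function name `smallest_pell_solution` is our own.

```python
from math import isqrt


def smallest_pell_solution(D):
    """Smallest positive (x, y) with x^2 - D*y^2 = 1, for non-square D > 0."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0         # state of the continued fraction expansion of sqrt(D)
    h_prev, h = 1, a0          # numerators of successive convergents
    k_prev, k = 0, 1           # denominators of successive convergents
    while h * h - D * k * k != 1:
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


for D in [2, 3, 7, 61, 109]:
    x, y = smallest_pell_solution(D)
    assert x * x - D * y * y == 1
    print(D, (x, y))
```

Python integers have arbitrary precision, so the fifteen-digit solution for D = 109 causes no overflow.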


Pythagorean Triples and Complex Numbers 


There is another way of looking at Pythagorean triples that involves complex 
numbers, surprisingly enough. The starting point here is the observation that $a^2 + b^2$
can be factored as $(a + bi)(a - bi)$ where $i = \sqrt{-1}$. If we rewrite the equation $a^2 + b^2 = c^2$
as $(a + bi)(a - bi) = c^2$ then since the right side of the equation is a square, we
might wonder whether each factor $a \pm bi$ on the left side would have to be a square
too. For example, in the case of the triple (3,4,5) we have $(3 + 4i)(3 - 4i) = 5^2$ with
$3 + 4i = (2 + i)^2$ and $3 - 4i = (2 - i)^2$. So let us ask optimistically whether the equation
$(a + bi)(a - bi) = c^2$ can be rewritten as $(p + qi)^2 (p - qi)^2 = c^2$ with $a + bi = (p + qi)^2$
and $a - bi = (p - qi)^2$. We might hope also that the equation $(p + qi)^2 (p - qi)^2 = c^2$
was obtained by simply squaring the equation $(p + qi)(p - qi) = c$. Let us see what
happens when we multiply these various products out: 


$a + bi = (p + qi)^2 = (p^2 - q^2) + (2pq)i$
    hence $a = p^2 - q^2$ and $b = 2pq$

$a - bi = (p - qi)^2 = (p^2 - q^2) - (2pq)i$
    hence again $a = p^2 - q^2$ and $b = 2pq$

$c = (p + qi)(p - qi) = p^2 + q^2$

Thus we have miraculously recovered the formulas for Pythagorean triples that we 
obtained earlier by geometric means, with a and b switched, which does not really 
matter: 

$a = p^2 - q^2 \qquad b = 2pq \qquad c = p^2 + q^2$


Our derivation of these formulas just now depended on several assumptions that we 
have not justified, but it does suggest that looking at complex numbers of the form 
a + bi where a and b are integers might be a good idea. These complex numbers 
a + bi with a and b integers are called Gaussian integers, after C. F. Gauss, the first
mathematician to make a thorough algebraic study of them some 200 years ago. We 
will develop the basic properties of Gaussian integers in Chapter 8, in particular ex- 
plaining why the derivation of the formulas above is valid. 
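The formal computation above is easy to test numerically. The sketch below squares $p + qi$ using integer arithmetic only and confirms that the result is a Pythagorean triple. It checks the formulas and does not justify the assumptions made in deriving them. The function name `triple_from_gaussian` is an illustrative choice.

```python
def triple_from_gaussian(p, q):
    """Return (a, b, c) where a + bi = (p + qi)^2 and c = (p + qi)(p - qi)."""
    a = p * p - q * q      # real part of (p + qi)^2
    b = 2 * p * q          # imaginary part of (p + qi)^2
    c = p * p + q * q      # the product (p + qi)(p - qi)
    return a, b, c

if __name__ == "__main__":
    for p in range(2, 6):
        for q in range(1, p):
            a, b, c = triple_from_gaussian(p, q)
            assert a * a + b * b == c * c
            print((p, q), (a, b, c))
```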


Rational Points on Quadratic Curves 


The same technique we used to find the rational points on the circle $x^2 + y^2 = 1$
can also be used to find all the rational points on other quadratic curves
$Ax^2 + Bxy + Cy^2 + Dx + Ey = F$ with integer or rational coefficients A, B, C, D, E, F,
provided that we can find a single rational point $(x_0, y_0)$ on the curve to start the
process. For example, the circle $x^2 + y^2 = 2$ contains the rational points
$(\pm 1, \pm 1)$ and we can use one of these as an initial point. Taking the point (1,1),
we would consider lines $y - 1 = m(x - 1)$ of slope m passing through this point. Solving
this equation for y and plugging into the equation $x^2 + y^2 = 2$ would produce a
quadratic equation $ax^2 + bx + c = 0$ whose coefficients are polynomials in the variable
m, so these coefficients would be rational whenever m is rational. From the quadratic
formula $x = (-b \pm \sqrt{b^2 - 4ac})/2a$ we see that the sum of the two roots is $-b/a$,
a rational number if m is rational, so if one root is rational then the other root will be
rational as well. The initial point (1,1) on the curve $x^2 + y^2 = 2$ gives x = 1 as one
rational root of the equation $ax^2 + bx + c = 0$, so for each rational value of m the
other root x will be rational. Then the equation $y - 1 = m(x - 1)$ implies that y will
also be rational, and hence we obtain a rational point (x, y) on the curve for each
rational value of m. Conversely, if x and y are both rational and $x \neq 1$ then obviously
$m = (y - 1)/(x - 1)$ will be rational. Thus one obtains a dense set of rational points on
the circle $x^2 + y^2 = 2$, since the slope m can be any rational number. An exercise at
the end of the chapter is to work out the formulas explicitly.
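The argument just given translates directly into a short computation. The sketch below substitutes $y = mx + (1 - m)$ into $x^2 + y^2 = 2$ and uses the fact that the two roots sum to $-b/a$ with one root equal to 1. Exact rational arithmetic is used throughout. The function name `second_point` is an illustrative choice.

```python
from fractions import Fraction

def second_point(m):
    """Second intersection of the line y - 1 = m(x - 1) with x^2 + y^2 = 2.

    Substituting y = m*x + (1 - m) gives a*x^2 + b*x + c = 0 with
    a = 1 + m^2 and b = 2m(1 - m). The roots sum to -b/a and one root is 1.
    """
    m = Fraction(m)
    a = 1 + m * m
    b = 2 * m * (1 - m)
    x = -b / a - 1          # the other root
    y = 1 + m * (x - 1)     # back-substitute into the line
    return x, y

if __name__ == "__main__":
    for m in (Fraction(0), Fraction(2), Fraction(1, 3), Fraction(-5, 7)):
        x, y = second_point(m)
        assert x * x + y * y == 2
        print(m, (x, y))
```

The slope m = -1 is the tangent direction at (1,1), so for that value the function returns (1,1) itself.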


Note that the point (1,-1) is a rational point on the circle which does not arise
from the formulas parametrizing x and y in terms of m since it corresponds to
$m = \infty$. This is analogous to the earlier case of the circle $x^2 + y^2 = 1$ where the
point (0,-1) corresponded to $m = \infty$ and r = 0. For the circle $x^2 + y^2 = 2$ we
could just as well use the parameter r instead of m, with (r,0) the point where the
line through (1,1) intersects the x-axis. There are simple formulas relating r and
m, namely $r = (m - 1)/m$ and $m = 1/(1 - r)$. From this viewpoint the exceptional slope
$m = \infty$ corresponds to r = 1 which is not exceptional for the parametrization by r,
while the exceptional value $r = \infty$ corresponds to the nonexceptional value m = 0
when the line through (1,1) is parallel to the x-axis.


If we consider the circle $x^2 + y^2 = 3$ instead of $x^2 + y^2 = 2$ then there are no
obvious rational points. And in fact this circle contains no rational points at all. For if
there were a rational point, this would yield a solution of the equation $a^2 + b^2 = 3c^2$
by integers a, b, and c with $c \neq 0$. We can assume a, b, and c have no common
factor. Then a and b cannot both be even, otherwise the left side of the equation
would be even, forcing c to be even, so a, b, and c would have a common factor
of 2. To complete the argument we look at the equation modulo 4. As we saw earlier,
the square of an even number is 0 mod 4, while the square of an odd number is 1
mod 4. Thus, modulo 4, the left side of the equation is either 0 + 1, 1 + 0, or 1 + 1
since a and b are not both even. So the left side is either 1 or 2 mod 4. However,
the right side is either $3 \cdot 0$ or $3 \cdot 1$ mod 4. We conclude that there can be no
integer solutions of $a^2 + b^2 = 3c^2$ with $c \neq 0$. When c = 0 there is of course the
trivial solution (a,b,c) = (0,0,0) but this is not interesting so we will generally disregard
it in equations of this type.
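Since there are only 64 triples of residues modulo 4, the congruence argument can be confirmed by listing them all. The sketch below finds every solution of $a^2 + b^2 \equiv 3c^2$ mod 4 and checks that a, b, and c are all even in each one. That is exactly what rules out a solution with no common factor. The function name `solutions_mod` is an illustrative choice.

```python
def solutions_mod(n):
    """All residue triples (a, b, c) mod n with a^2 + b^2 = 3c^2 (mod n)."""
    return [
        (a, b, c)
        for a in range(n)
        for b in range(n)
        for c in range(n)
        if (a * a + b * b - 3 * c * c) % n == 0
    ]

if __name__ == "__main__":
    sols = solutions_mod(4)
    # Every solution mod 4 has a, b, c all even, so an integer solution
    # would always have the common factor 2.
    assert all(a % 2 == 0 and b % 2 == 0 and c % 2 == 0 for a, b, c in sols)
    print(sols)
```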

The technique we just used to show that $a^2 + b^2 = 3c^2$ has no nontrivial integer
solutions can be used in many other situations as well. The underlying reasoning is 
that if an equation with integer coefficients has an integer solution, then this gives 
a solution modulo n for all numbers n. For solutions modulo n there are only a 
finite number of possibilities to check, although for large n this is a large finite num- 
ber. If one can find a single value of n for which there is no solution modulo n, 
then the original equation has no integer solutions. However, this implication is not 
reversible, as it is possible for an equation to have solutions modulo n for every
number n and still have no actual integer solutions. A concrete example is the equation
$2x^2 + 7y^2 = 1$. This obviously has no integer solutions, yet it does have solutions
modulo n for each n, although this is certainly not obvious. Note that the ellipse
$2x^2 + 7y^2 = 1$ does contain rational points such as (1/3, 1/3) and (3/5, 1/5). These can
in fact be used to show that $2x^2 + 7y^2 = 1$ has solutions modulo n for each n, as
we will show in Section 2.3 of Chapter 2 when we study congruences in more detail.
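For any particular n the claim about $2x^2 + 7y^2 = 1$ can be tested by a finite search. The sketch below looks for a solution modulo n by trying all residue pairs. Checking a range of n this way is evidence for the claim, not a proof of it. The function name `solution_mod` is an illustrative choice.

```python
def solution_mod(n):
    """Return some (x, y) with 2x^2 + 7y^2 = 1 (mod n), or None if there is none."""
    for x in range(n):
        for y in range(n):
            if (2 * x * x + 7 * y * y - 1) % n == 0:
                return x, y
    return None

if __name__ == "__main__":
    for n in range(1, 31):
        print(n, solution_mod(n))
```

For example, modulo 9 the rational point (3/5, 1/5) gives the solution (6, 2), since 2 is the inverse of 5 mod 9 and $2 \cdot 36 + 7 \cdot 4 = 100 \equiv 1$.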

In Chapter 6 we will find a complete answer to the question of when the circle 
$x^2 + y^2 = n$ contains rational points by showing that there are rational points on this
circle only when there are integer points on it. This reduces the problem to one we 
considered earlier, finding the integers n that are sums of two squares. 

Determining when a quadratic curve contains rational points turns out to be much 
easier than determining when it has integer points. The general problem reduces 
fairly quickly to finding rational points on ellipses or hyperbolas of the special form 
$Ax^2 + By^2 = C$ where A, B, and C are integers that are not divisible by squares greater
than 1, and such that no two of A, B, and C have a common factor. A theorem of
Legendre then asserts that the curve $Ax^2 + By^2 = C$ contains rational points exactly
when three congruence conditions modulo A, B, and C are satisfied, namely AC
must be congruent mod B to the square of some number, and likewise BC must be a
square mod A and $-AB$ must be a square mod C. (There is also the obvious condition
that A and B cannot both have opposite sign from C.) For example, if C = 1 this
reduces just to saying that each of A and B is congruent to a square modulo the
other one since the congruence condition mod C holds automatically when C = 1.
For the ellipse $2x^2 + 7y^2 = 1$ this agrees with what we saw earlier since 2 is a square
mod 7, namely $3^2$, and 7 is a square mod 2, namely $1^2$, so Legendre’s theorem
guarantees that the curve has a rational point. In the case of the circle $x^2 + y^2 = 3$
the congruence conditions reduce simply to $-1$ being a square mod 3, which it is not
since every number is congruent to 0, 1, or 2 mod 3 so the squares mod 3 are just
0 and 1 since $2^2 \equiv 1$ mod 3.
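Legendre's conditions are simple enough to test mechanically. The sketch below checks them exactly as stated above, finding squares by brute force. It assumes without checking that A, B, and C are nonzero, squarefree, and pairwise without common factors. The function names are illustrative choices.

```python
def is_square_mod(a, n):
    """True if a is congruent to a square modulo |n| (n nonzero)."""
    n = abs(n)
    return any((x * x - a) % n == 0 for x in range(n))

def legendre_has_rational_point(A, B, C):
    """Test Legendre's conditions for the curve A x^2 + B y^2 = C.

    Assumes A, B, C are nonzero squarefree integers, no two of which
    share a common factor. These hypotheses are not verified here.
    """
    # A and B must not both have the opposite sign from C.
    signs_ok = not (A * C < 0 and B * C < 0)
    return (
        signs_ok
        and is_square_mod(A * C, B)
        and is_square_mod(B * C, A)
        and is_square_mod(-A * B, C)
    )

if __name__ == "__main__":
    print(legendre_has_rational_point(2, 7, 1))   # the ellipse 2x^2 + 7y^2 = 1
    print(legendre_has_rational_point(1, 1, 3))   # the circle x^2 + y^2 = 3
    print(legendre_has_rational_point(1, 1, 2))   # the circle x^2 + y^2 = 2
```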


Diophantine Equations 


Equations like $x^2 + y^2 = z^2$ or $x^2 - Dy^2 = 1$ that involve polynomials with
integer coefficients, and where the solutions sought are required to be integers, or
perhaps just rationals, are called Diophantine equations after the Greek mathemati- 
cian Diophantus (ca. 250 A.D.) who wrote a book about these equations that was very 
influential when European mathematicians started to consider this topic much later 
in the 1600s. Usually Diophantine equations are very hard to solve because of the 
restriction to integer solutions. The first really interesting case is quadratic Diophan- 
tine equations. By the year 1800 there was quite a lot known about the quadratic case, 
and we will be focusing on this case in this book.

Diophantine equations of higher degree than quadratic are much more challenging
to understand. Probably the most famous one is $x^n + y^n = z^n$ where n is a fixed
integer greater than 2. In the 1600s when the French mathematician Fermat was reading
about Pythagorean triples in his copy of Diophantus’ book, he made a marginal
note that, in contrast with the equation $x^2 + y^2 = z^2$, the equation $x^n + y^n = z^n$ has
no solutions with positive integers x, y, z when n > 2. This is one of many statements
that he claimed were true but never wrote proofs of for public distribution, nor
have proofs been found among his manuscripts. Over the next century other math- 
ematicians discovered proofs for all his other statements, but this one was far more 
difficult to verify. The issue is clouded by the fact that he only wrote this statement 
down the one time, whereas all his other important results were stated numerous 
times in his correspondence with other mathematicians of the time. So perhaps he 
only briefly believed he had a proof. In any case, the statement has become known 
as Fermat’s Last Theorem. It was finally proved in the 1990s by Andrew Wiles, using 
some very deep mathematics developed mostly over the preceding couple decades. 

We have seen that finding integer solutions of $x^2 + y^2 = z^2$ is equivalent to finding
rational points on the circle $x^2 + y^2 = 1$, and in the same way, finding integer solutions
of $x^n + y^n = z^n$ is equivalent to finding rational points on the curve $x^n + y^n = 1$.
For even values of n > 2 this curve looks like a flattened circle or rounded square,
while for odd n it has a similar shape in the first quadrant but a rather different shape
elsewhere, extending out to infinity in the second and fourth quadrants, asymptotic
to the line $y = -x$:


Fermat’s Last Theorem is equivalent to the statement that these curves have no ra- 
tional points except their intersections with the coordinate axes, where x or y is 0. 
These examples show that it is possible for a curve defined by an equation of degree 
greater than 2 to contain only a finite number of rational points (either two points or 
four points here, depending on whether n is odd or even) whereas quadratic curves 
like $x^2 + y^2 = n$ contain either no rational points or an infinite dense set of rational
points. 

After quadratic curves the next case that has been studied in great depth is cubic 
curves such as the curves defined by equations $y^2 = x^3 + ax^2 + bx + c$. These are
known as elliptic curves, not because they are ellipses but because of a connection
with the problem of computing the length of an arc of an ellipse. Depending on the
values of the coefficients a,b,c elliptic curves can have either one or two connected 
pieces: 


In some cases the number of rational points is finite, any number from 0 to 10 as
well as 12 or 16 according to a difficult theorem of Mazur. In other cases the number 
of rational points is infinite and they form a dense set in the curve, or possibly just in 
the component that stretches to infinity when there are two components. There is no 
simple way known for predicting the number of rational points from the coefficients. 
Interestingly, elliptic curves play an important role in the proof of Fermat’s Last The- 
orem. Their theory is much deeper than for quadratic curves, and so elliptic curves 
are well beyond the scope of this book. 


Rational Points on a Sphere 


Although we will not be discussing this later in the book, another way to gen- 
eralize quadratic curves, in a different direction from considering cubic and higher 
degree curves, is to keep the quadratic condition but introduce more variables. After 
quadratic curves the next case would be quadratic surfaces, or as they are usually 
called, quadric surfaces. These are surfaces in three-dimensional space defined by an 
equation $Q(x,y,z) = n$ where $Q(x,y,z)$ is a quadratic function of three variables.
Perhaps the simplest example is the equation $x^2 + y^2 + z^2 = 1$ which defines the
sphere of radius 1 with center at the origin. Other quadric surfaces are ellipsoids, 
paraboloids, hyperboloids, and certain cones and cylinders. 


Much of the theory of quadric surfaces parallels that for quadratic curves. To 
illustrate, let us consider the problem of finding all the rational points on the sphere 
$x^2 + y^2 + z^2 = 1$, the triples (x,y,z) of rational numbers that satisfy this equation.
Some obvious rational points are the points where the sphere meets the coordinate
axes such as the point (0,0,1) on the z-axis. Following what we did for the circle
$x^2 + y^2 = 1$, consider a line from (0,0,1) to a point (u,v,0) in the xy-plane. This line
intersects the sphere at some point (x,y,z), and we want to find formulas expressing
x, y, and z in terms of u and v. To do this we use the following figure:

Suppose we look at the vertical plane containing the triangle ONQ. From our earlier
analysis of rational points on a circle of radius 1 we know that if the segment OQ has
length $|OQ| = r$, then $|OP'| = 2r/(r^2 + 1)$ and $z = (r^2 - 1)/(r^2 + 1)$. From the right
triangle OBQ we see that $u^2 + v^2 = r^2$. The triangle OBQ is similar to the triangle OAP'
and the scaling factor to go from OBQ to OAP' is

$$\frac{|OP'|}{|OQ|} = \frac{2r/(r^2 + 1)}{r} = \frac{2}{r^2 + 1}$$

Hence

$$x = \frac{2}{r^2 + 1}\,u = \frac{2u}{u^2 + v^2 + 1} \qquad\text{and}\qquad y = \frac{2}{r^2 + 1}\,v = \frac{2v}{u^2 + v^2 + 1}$$

Also we have

$$z = \frac{r^2 - 1}{r^2 + 1} = \frac{u^2 + v^2 - 1}{u^2 + v^2 + 1}$$

Summarizing, we have expressed x, y, and z in terms of u and v by the formulas

$$x = \frac{2u}{u^2 + v^2 + 1} \qquad y = \frac{2v}{u^2 + v^2 + 1} \qquad z = \frac{u^2 + v^2 - 1}{u^2 + v^2 + 1}$$

We can also express u and v in terms of x, y, and z. The projection of the point
P = (x,y,z) onto the xz-plane is the point (x,0,z) which is on the line through B and N.
The slope of this line is $-1/u$ so the equation for the line is $z = 1 - x/u$. Solving
this for u gives $u = x/(1 - z)$. Interchanging x and y corresponds to interchanging u
and v so we also have $v = y/(1 - z)$.
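The formulas in both directions can be tried out with exact rational arithmetic. The sketch below is only an illustration of the formulas just derived. The function names `to_sphere` and `to_plane` are hypothetical.

```python
from fractions import Fraction

def to_sphere(u, v):
    """Point (x, y, z) on the unit sphere corresponding to (u, v) in the plane."""
    u, v = Fraction(u), Fraction(v)
    s = u * u + v * v + 1
    return 2 * u / s, 2 * v / s, (s - 2) / s   # (s - 2) equals u^2 + v^2 - 1

def to_plane(x, y, z):
    """Inverse map; undefined at the pole (0, 0, 1), where 1 - z = 0."""
    x, y, z = Fraction(x), Fraction(y), Fraction(z)
    return x / (1 - z), y / (1 - z)

if __name__ == "__main__":
    for u, v in [(1, 2), (2, 3), (Fraction(3, 2), Fraction(3, 2))]:
        x, y, z = to_sphere(u, v)
        assert x * x + y * y + z * z == 1
        assert to_plane(x, y, z) == (u, v)
        print((u, v), (x, y, z))
```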

From the formulas relating (x,y,z) to (u,v) we see that x, y, and z are rational
exactly when u and v are rational. Thus we have formulas for all the rational points
(x,y,z) on the sphere except for the pole (0,0,1) in terms of rational parameters u
and v.

Here is a short table giving a few rational points on the sphere and the
corresponding integer solutions of the equation $a^2 + b^2 + c^2 = d^2$:

    (u,v)         (x,y,z)                  (a,b,c,d)
    (1,2)         (1/3, 2/3, 2/3)          (1,2,2,3)
    (2,3)         (2/7, 3/7, 6/7)          (2,3,6,7)
    (1,4)         (1/9, 4/9, 8/9)          (1,4,8,9)
    (2,2)         (4/9, 4/9, 7/9)          (4,4,7,9)
    (1,3)         (2/11, 6/11, 9/11)       (2,6,9,11)
    (3/2,3/2)     (6/11, 6/11, 7/11)       (6,6,7,11)
    (3,4)         (3/13, 4/13, 12/13)      (3,4,12,13)
    (2,5)         (2/15, 5/15, 14/15)      (2,5,14,15)
    (1/2,5/2)     (2/15, 10/15, 11/15)     (2,10,11,15)

These are in fact all the primitive positive solutions of $a^2 + b^2 + c^2 = d^2$ with
$d \leq 15$, up to permutations of a, b, and c.
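The completeness of the table can be confirmed by a direct search. The sketch below lists all positive solutions of $a^2 + b^2 + c^2 = d^2$ with $a \leq b \leq c$, no common factor, and d up to a given bound. It is a brute-force check that makes no use of the parametrization by (u,v). The function name `primitive_solutions` is an illustrative choice.

```python
from math import gcd, isqrt

def primitive_solutions(max_d):
    """Primitive positive (a, b, c, d) with a <= b <= c and a^2+b^2+c^2 = d^2, d <= max_d."""
    found = []
    for d in range(1, max_d + 1):
        for a in range(1, d):
            for b in range(a, d):
                rest = d * d - a * a - b * b   # must equal c^2 with c >= b
                if rest < b * b:
                    break                      # larger b only makes this worse
                c = isqrt(rest)
                if c * c == rest and gcd(gcd(a, b), gcd(c, d)) == 1:
                    found.append((a, b, c, d))
    return found

if __name__ == "__main__":
    for solution in primitive_solutions(15):
        print(solution)
```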

As with rational points on the circle x? + y? = 1, rational points on the sphere 
x? + y? + z? = 1 are dense since rational points are dense in the xy-plane. Thus 
there are lots of rational points scattered all over the sphere. In linear algebra courses 
one is often called upon to create unit vectors (x,y,z) by taking a given vector and 
rescaling it to have length 1 by dividing it by its length. For example, the vector 
(1,1,1) has length $\sqrt{3}$ so the corresponding unit vector is $(1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$. It is rare
that this process produces unit vectors having rational coordinates, but the formulas 
derived above give a way to create as many rational unit vectors as we like. 

The correspondence we have described between points (x,y,z) on a sphere and
points (u,v) in the plane is called stereographic projection. One can think of the 
sphere and the plane as being made of clear glass, and if one looks outward and 
downward from the north pole of the sphere the points of the sphere are projected 
onto points in the plane, and vice versa. The north pole itself does not project onto 
any point in the plane, but points approaching the north pole project to points ap- 
proaching infinity in the plane, so one can think of the north pole as corresponding to 
an imaginary infinitely distant “point” in the plane. This geometric viewpoint some- 
how makes infinity less of a mystery, as it just corresponds to a point on the sphere, 
and points on a sphere are not very mysterious. (Though in the early days of polar 
exploration the north pole may have seemed very mysterious and infinitely distant.) 

One might ask also about spheres $x^2 + y^2 + z^2 = n$, following what we did
for circles $x^2 + y^2 = n$. Finding an integer point on $x^2 + y^2 + z^2 = n$ is asking
whether n is a sum of three squares. One can test small values of n and one finds
that most numbers are sums of three squares, so it is easier to list the ones that are
not: 7, 15, 23, 28, 31, 39, 47, 55, 60, 63, 71, 79, 87, 92, 95, .... The odd numbers here
are just the numbers 8k + 7, and the even numbers seem to be 4 times the earlier
numbers on the list. In fact it is easy to see that numbers congruent to 7 mod 8 cannot
be expressed as sums of three squares by the following argument. The squares mod 8
are $0^2 = 0$, $(\pm 1)^2 = 1$, $(\pm 2)^2 = 4$, $(\pm 3)^2 = 9 \equiv 1$, and $4^2 = 16 \equiv 0$, so the squares
of even numbers are 0 or 4 mod 8 and the squares of odd numbers are 1 mod 8.
Obviously 7 cannot be realized as a sum of three terms 0, 1, or 4, so numbers
congruent to 7 mod 8 cannot be sums of three squares.

To rule out numbers 4(8k + 7) as sums of three squares, we can work mod 4 where
the squares are just 0 and 1. If we have $x^2 + y^2 + z^2 = 4n$ then $x^2 + y^2 + z^2 \equiv 0$
mod 4, and the only way to get 0 as a sum of three numbers 0 or 1 is as 0 + 0 + 0.
This means each of x, y, and z must be even, so we can cancel a 4 from both sides
of the equation $x^2 + y^2 + z^2 = 4n$ to get n expressed as a sum of three squares.
Thus numbers 4(8k + 7) are never realizable as sums of three squares since 8k + 7
is never a sum of three squares. Repeating this argument, we see that 16(8k + 7) is
never a sum of three squares since 4(8k + 7) is not a sum of three squares. Similarly
$4^l(8k + 7)$ is never a sum of three squares for any larger exponent l.

The converse statement that every number not of the form $4^l(8k + 7)$ is expressible
as a sum of three squares is true but is much harder to prove. This was first done
by Legendre.
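Both the list of exceptions and the criterion $4^l(8k + 7)$ can be compared by a direct search over small n. The sketch below tests each n by brute force, allowing squares equal to 0, and separately tests whether n has the excluded form. The function names are illustrative choices, and n is assumed to be a positive integer.

```python
from math import isqrt

def is_sum_of_three_squares(n):
    """Brute-force test whether n = x^2 + y^2 + z^2 with integers x, y, z >= 0."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            rest = n - x * x - y * y
            z = isqrt(rest)
            if z * z == rest:
                return True
    return False

def has_excluded_form(n):
    """True if the positive integer n equals 4^l (8k + 7) for some l, k >= 0."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7

if __name__ == "__main__":
    exceptions = [n for n in range(1, 100) if not is_sum_of_three_squares(n)]
    print(exceptions)
    # The two tests should agree for every n in the range.
    assert all(is_sum_of_three_squares(n) != has_excluded_form(n) for n in range(1, 1000))
```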

This answers the question of when the sphere $x^2 + y^2 + z^2 = n$ contains integer
points, but could it contain rational points without containing integer points? Let us
show that this cannot happen. A rational point on $x^2 + y^2 + z^2 = n$ is equivalent to
an integer solution of $a^2 + b^2 + c^2 = nd^2$. It will suffice to show that if n is not a sum
of three squares, then neither is $nd^2$ for any integer d. An equivalent statement is
that if n is of the form $4^l(8k + 7)$ then so is $nd^2$. To prove this, let us write d as $2^p q$
with q odd and $p \geq 0$, hence $d^2 = 4^p q^2$ with $q^2 \equiv 1$ mod 8 since q is odd. Thus we
have $nd^2 = 4^{l+p}(8k + 7)q^2$ where the product $(8k + 7)q^2$ is 7 mod 8 since 8k + 7
is 7 mod 8 and $q^2$ is 1 mod 8. This shows what we wanted, that if n is of the form
$4^l(8k + 7)$ then so is $nd^2$.

For a general quadric surface defined by a quadratic equation with integer coef- 
ficients there is a theorem due to Minkowski, analogous to Legendre’s theorem for 
quadratic curves, that says that rational points exist exactly when certain congruence 
conditions are satisfied. In general, having rational points on a quadric surface is not 
equivalent to having integer points as it was for spheres, and the existence of integer 
points is a more delicate question. 

Moving on to four variables, one could ask about integer or rational points on the 
spheres $x^2 + y^2 + z^2 + w^2 = n$ in four-dimensional space. Integers that could not
be expressed as the sum of three squares can be realized as sums of four squares,
for example $7 = 2^2 + 1^2 + 1^2 + 1^2$ and $15 = 3^2 + 2^2 + 1^2 + 1^2$, and it is a theorem
of Lagrange that every positive number can be expressed as the sum of four squares.
Thus the spheres $x^2 + y^2 + z^2 + w^2 = n$ always contain integer points.
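Lagrange's theorem guarantees that a search for four squares always succeeds, so a representation can be found by simple enumeration. The sketch below tries x, y, z in decreasing order and returns the first representation it meets. It is a naive search that assumes n is a nonnegative integer, and it illustrates the statement without proving it. The function name `four_squares` is an illustrative choice.

```python
from math import isqrt

def four_squares(n):
    """Return (x, y, z, w) with x >= y >= z >= w >= 0 and x^2+y^2+z^2+w^2 = n."""
    for x in range(isqrt(n), -1, -1):
        for y in range(min(x, isqrt(n - x * x)), -1, -1):
            for z in range(min(y, isqrt(n - x * x - y * y)), -1, -1):
                rest = n - x * x - y * y - z * z
                w = isqrt(rest)
                if w * w == rest and w <= z:
                    return x, y, z, w
    return None  # not reached for n >= 0, by Lagrange's theorem

if __name__ == "__main__":
    for n in (7, 15, 23, 28):
        print(n, four_squares(n))
```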


Minkowski’s theorem remains true for quadratic equations with integer coefficients
in any number of variables, as does the fact that the existence of a single rational
solution implies that rational solutions are dense.


Exercises 


1. (a) Make a list of the 16 primitive Pythagorean triples (a,b,c) with c < 100, 
regarding (a,b,c) and (b,a,c) as the same triple. 

(b) How many more would there be if we allowed nonprimitive triples? 

(c) How many triples (primitive or not) are there with c = 65? 


2. (a) Find all the positive integer solutions of $x^2 - y^2 = 512$ by factoring $x^2 - y^2$ as
$(x + y)(x - y)$ and considering the possible factorizations of 512.

(b) Show that the equation $x^2 - y^2 = n$ has only a finite number of integer solutions
for each value of n > 0.

(c) Find a value of n > 0 for which the equation $x^2 - y^2 = n$ has at least 100 different
positive integer solutions.


3. (a) Show that there are only a finite number of Pythagorean triples (a,b,c) with a 
equal to a given number n. 
(b) Show that there are only a finite number of Pythagorean triples (a,b,c) with c 
equal to a given number n. 


4. Find an infinite sequence of primitive Pythagorean triples where two of the numbers 
in each triple differ by 2. 


5. Find a right triangle whose sides have integer lengths and whose acute angles are 
close to 30 and 60 degrees by first finding the irrational value of r that corresponds to 
a right triangle with acute angles exactly 30 and 60 degrees, then choosing a rational
number close to this irrational value of r. 


6. Find a right triangle whose sides have integer lengths and where one of the two 
shorter sides is approximately twice as long as the other, using a method like the one 
in the preceding problem. (One possible answer might be the (8,15,17) triangle, or 
a triangle similar to this, but you should do better than this.) 


7. Find a rational point on the sphere $x^2 + y^2 + z^2 = 1$ whose three coordinates are
nearly equal. 


8. (a) Derive formulas that give all the rational points on the circle $x^2 + y^2 = 2$ in
terms of a rational parameter m, the slope of the line through the point (1,1) on the
circle. (The value $m = \infty$ should be allowed as well, yielding the point (1,-1).) The
calculations may be a little messy, but they eventually simplify to give formulas that
are not too complicated:

$$x = \frac{m^2 - 2m - 1}{m^2 + 1} \qquad y = \frac{-m^2 - 2m + 1}{m^2 + 1}$$

(b) Using these formulas, find five different rational points on the circle in the first
quadrant, and hence five solutions of $a^2 + b^2 = 2c^2$ with positive integers a, b, c.
(c) The equation $a^2 + b^2 = 2c^2$ can be rewritten as $c^2 = \frac{1}{2}(a^2 + b^2)$, which says that
$c^2$ is the average of $a^2$ and $b^2$, or in other words, the squares $a^2$, $c^2$, $b^2$ form an
arithmetic progression. One can assume a < b by switching a and b if necessary.
Find four such arithmetic progressions of three increasing squares where in each case
the three numbers have no common divisors.


9. (a) Find formulas that give all the rational points on the upper branch of the
hyperbola $y^2 - x^2 = 1$.

(b) Can you find any relationship between these rational points and Pythagorean 
triples? 


10. (a) Show that the equation $x^2 - 2y^2 = \pm 3$ has no integer solutions by considering
this equation modulo 8. 

(b) Show that there are no primitive Pythagorean triples (a,b,c) with a and b differ- 
ing by 3. 


11. Show there are no rational points on the circle $x^2 + y^2 = 3$ using congruences
modulo 3 instead of modulo 4. 


12. Show that for every Pythagorean triple (a,b,c) the product abc must be divisible 
by 60. (It suffices to show that abc is divisible by 3, 4, and 5.) 


13. Use congruences modulo 8 to show that primitive solutions of $a^2 + b^2 + c^2 = d^2$
must have d odd and must have two of a, b, c even and the other odd.


14. Show that if the curve $x^n + y^n = 1$ has a rational point with x and y nonzero,
then it has a rational point with x and y positive. Hint: Consider the equation
$a^n + b^n = c^n$.




1 The Farey Diagram


Our goal is to use geometry to study numbers. Of the various kinds of numbers, 
the simplest are integers, along with their ratios, the rational numbers. Usually one 
thinks of rational numbers geometrically as points along a line, interspersed with 
irrational numbers as well. In this chapter we introduce a two-dimensional pictorial 
representation of rational numbers that displays certain interesting relations between 
them that we will be exploring. This diagram, along with several variants of it that 
will be introduced later, is known as the Farey diagram. The origin of the name will 
be explained when we get to one of these variants. Here is the diagram: 


[Figure: the Farey diagram]


What is shown here is not the whole diagram but only a finite part of it. The actual 
diagram has infinitely many curvilinear triangles, getting smaller and smaller out near
the boundary circle. The diagram can be constructed by first inscribing the two big
triangles in the circle, then adding the four triangles that share an edge with the two 
big triangles, then the eight triangles sharing an edge with these four, then sixteen 
more triangles, and so on forever. With a little practice one can draw the diagram 
without lifting one’s pencil from the paper: First draw the outer circle starting at the 
left or right side, then the diameter, then make the two large triangles, then the four 
next-largest triangles, and so on. 

Our first task will be to explain how the vertices of all the triangles are labeled 
with rational numbers. Perhaps the reader can guess what the rules are before we 
spell them out in detail. 


1.1 The Mediant Rule 


The vertices of the triangles in the Farey diagram are labeled with fractions a/b,
including the fraction 1/0 for $\infty$, according to the following scheme. In the upper half
of the diagram, first label the vertices of the big triangle 1/0, 0/1, and 1/1. Then add
labels for successively smaller triangles by the rule that, if the labels at the two ends
of the long edge of a triangle are a/b and c/d, then the label on the third vertex of the
triangle is (a+c)/(b+d), so the numerators and denominators are added separately,
contrary to the usual way of adding fractions. The fraction (a+c)/(b+d) is called the
mediant of a/b and c/d.

The labels in the lower half of the diagram follow the same scheme, starting with
the labels $-1/0$, $0/1$, and $-1/1$ on the large triangle. Using $-1/0$ instead of $1/0$ as the
label of the vertex at the far left means that we are regarding $+\infty$ and $-\infty$ as the
same. The labels in the lower half of the diagram are the negatives of those in the
upper half, and the labels in the left half are the reciprocals of those in the right half.

For fractions with a nonzero denominator our usual rule will be to write them 
with a positive denominator, so the sign of the fraction is the sign of the numerator. 
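The labeling scheme for the upper half of the diagram can be sketched in a few lines. Fractions are stored as (numerator, denominator) pairs so that 1/0 can be represented, and each round inserts the mediant between every pair of neighboring labels. The function names and the pair representation are illustrative choices. The final check confirms that the finite labels come out in increasing order.

```python
from fractions import Fraction

def mediant(f, g):
    """Mediant of a/b and c/d, given as pairs (a, b) and (c, d)."""
    (a, b), (c, d) = f, g
    return (a + c, b + d)

def farey_labels(levels):
    """Labels from 0/1 to 1/0 after `levels` rounds of inserting mediants."""
    labels = [(0, 1), (1, 0)]          # (1, 0) stands for 1/0, i.e. infinity
    for _ in range(levels):
        nxt = [labels[0]]
        for f, g in zip(labels, labels[1:]):
            nxt += [mediant(f, g), g]  # new vertex between two neighbors
        labels = nxt
    return labels

def is_increasing(labels):
    """Check that the labels with nonzero denominator are strictly increasing."""
    values = [Fraction(a, b) for a, b in labels if b != 0]
    return all(s < t for s, t in zip(values, values[1:]))

if __name__ == "__main__":
    labels = farey_labels(3)
    print(labels)
    assert is_increasing(labels)
```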

The labels generated by the mediant rule occur in their proper order around the
circle, increasing from $-\infty$ to $+\infty$ as one goes around the circle in the
counterclockwise direction. This is obviously true for the integer labels, and to verify it
for the others it suffices to show that the mediant $(a+c)/(b+d)$ of $a/b$ and $c/d$ is always
a number between $a/b$ and $c/d$ (hence the term “mediant”). Thus we want to show that if
$a/b < c/d$ then $a/b < (a+c)/(b+d) < c/d$. These fractions all have positive denominators,
so the inequality $a/b < c/d$ is equivalent to $ad < bc$ and $a/b < (a+c)/(b+d)$ is equivalent
to $ab + ad < ab + bc$. Obviously $ad < bc$ implies $ab + ad < ab + bc$, so $a/b < c/d$
implies $a/b < (a+c)/(b+d)$. Similarly $(a+c)/(b+d) < c/d$ is equivalent to $ad + cd < bc + cd$
which also follows from $ad < bc$, so $a/b < c/d$ implies $(a+c)/(b+d) < c/d$.

There is another version of the Farey diagram with the boundary circle straightened
out to a line:


Here the diagram fills up the upper half of the xy-plane, with the vertex 1/0 of the
original Farey diagram positioned “at infinity” so it is not actually shown in the new
version. The edges of the diagram with one endpoint at 1/0 are drawn as vertical lines
with lower endpoints at the integer points on the x-axis. All the other edges of the
diagram are semicircles with endpoints on the x-axis, and we can position these so
that the vertex labeled a/b is actually the number a/b on the x-axis. This is possible
since when we construct the diagram by adding more and more curvilinear triangles, 
we can place the new vertex of each triangle at any point between its outer two vertices, 
so we just choose this new vertex to be at the mediant of the outer two vertices. 


In the previous chapter we described how rational points (x,y) on the unit circle
$x^2 + y^2 = 1$ correspond to rational points p/q on the x-axis by means of lines
through the point (0,1) on the circle. Using this correspondence, we can label the 
rational points on the circle by the corresponding rational points on the x-axis and 
then construct a new Farey diagram in the circle by filling in triangles by the mediant 
rule just as before. 


This gives a version of the circular Farey diagram that is rotated by 90 degrees to put
1/0 at the top of the circle, and there are also some perturbations of the positions of the
other vertices and the shapes of the triangles. For our purposes these perturbations
will not make much of a difference since it will usually be just the combinatorial
pattern of the triangles that is important. We drew the circular Farey diagram the way 
we did at the beginning of the chapter because it looks more symmetric and is easier 
to draw since one does not have to figure out the exact positions of the vertices. 

The next figure shows the relationship between the new circular Farey diagram
and Pythagorean triples (a,b,c) using the formulas $(a,b,c) = (2pq, p^2 - q^2, p^2 + q^2)$
that we found in the previous chapter. The vertex with label p/q thus has coordinates
$(a/c, b/c) = \left(\frac{2pq}{p^2 + q^2}, \frac{p^2 - q^2}{p^2 + q^2}\right)$.


The construction we have described for the Farey diagram involves an inductive 
process where more and more edges and vertex labels are added in succession. With 
a construction like this it is not easy to tell by a simple calculation whether or not two 
given rational numbers a/b and c/d are joined by an edge in the diagram. Fortunately
there is such a criterion:


Proposition 1.1. For each pair of fractions a/b and c/d, including $\pm 1/0$, there exists
an edge in the Farey diagram with endpoints labeled a/b and c/d if and only if the
determinant $ad - bc$ of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is equal to $\pm 1$.


What this means is that if one starts with the rational numbers together with
$\pm 1/0$ arranged in order around a circle and one inserts circular arcs inside this circle
meeting it perpendicularly and joining each pair of fractions $a/b$ and $c/d$ such
that $ad - bc = \pm 1$, with the circular arc replaced by a diameter in case $a/b$ and
$c/d$ are diametrically opposite on the circle, then no two of these arcs will cross,
and they will divide the interior of the circle into nonoverlapping curvilinear triangles.
This is really quite remarkable when you think about it, and it does not happen
for other values of the determinant besides $\pm 1$. For example, for determinant $\pm 2$
the edges would be the dotted arcs in the figure at the right. Here there
are three arcs crossing in each triangle of the original Farey diagram, and these arcs
divide each triangle of the Farey diagram into six smaller triangles.




Proof: First we show by an inductive argument that for an edge in the diagram joining
two fractions $a/b$ and $c/d$ the associated matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ has determinant $\pm 1$. The
induction starts with the edge joining $1/0$ to $0/1$ where the determinant condition
obviously holds. All the other edges are added in stages, first the four edges creating
the two biggest triangles, then the eight edges creating the next four triangles, and
so on. Consider a triangle created at some stage by adding a new vertex labeled
$\frac{a+c}{b+d}$ as the mediant of vertices $a/b$ and $c/d$ from
an earlier stage, as in the figure at the right. We may
assume by induction that $ad - bc = \pm 1$ for the long
edge of the triangle which was added at an earlier stage.
The determinant condition then holds also for the two
shorter edges of the triangle since $a(b+d) - b(a+c) = ad - bc$
and $(a+c)d - (b+d)c = ad - bc$. Thus the
determinant condition continues to hold after each stage
of the construction of the diagram, so it holds for all
edges.

Now we prove the converse, the statement that if $ad - bc = \pm 1$ then there is
an edge in the diagram joining $a/b$ and $c/d$. We may assume $b \ge 0$ and $d \ge 0$ by
multiplying both numerator and denominator of either fraction by $-1$ if necessary,
which multiplies the determinant $ad - bc$ by $-1$. The order of the two fractions $a/b$
and $c/d$ does not matter since interchanging the two columns of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ also
multiplies the determinant by $-1$. If $b$ or $d$ is $0$, say $b = 0$, then the determinant
condition becomes $ad = \pm 1$ so $d = 1$ and $a = \pm 1$. In this case the fractions $a/b$
and $c/d$ are $\pm 1/0$ and $c/1$, so they lie at the ends of an edge of the diagram, one of the
vertical edges to $1/0$ in the upper halfplane version of the diagram. Thus for the rest
of the proof we may assume $b > 0$ and $d > 0$.
The previous figure shows that adding a new triangle to the diagram creates two
new edges corresponding to matrices obtained from $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ by replacing one of the
columns by the sum of the two columns. To finish the proof we will show that for
each matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ of determinant $\pm 1$ with $b > 0$ and $d > 0$ it is possible to perform
a finite sequence of the inverse operations of subtracting one column from the other
and end up with a matrix that we already know corresponds to an edge in the diagram.
We will do this by always subtracting the column with smaller second entry from the
column with larger second entry, so that these two entries remain positive. We stop
the process when the two entries in the second row become equal. For example, here


is how the process works for the matrix $\begin{pmatrix} 3 & 7 \\ 8 & 19 \end{pmatrix}$:

$$\begin{pmatrix} 3 & 7 \\ 8 & 19 \end{pmatrix} \to \begin{pmatrix} 3 & 4 \\ 8 & 11 \end{pmatrix} \to \begin{pmatrix} 3 & 1 \\ 8 & 3 \end{pmatrix} \to \begin{pmatrix} 2 & 1 \\ 5 & 3 \end{pmatrix} \to \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \to \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \to \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$$

Here the last matrix corresponds to the edge joining $1/1$ and $0/1$. Reversing the steps
reducing $\begin{pmatrix} 3 & 7 \\ 8 & 19 \end{pmatrix}$ to $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, we are adding one column to the other at each stage so each
new matrix produced in this way corresponds to an edge of the diagram. In particular
this shows that the original matrix $\begin{pmatrix} 3 & 7 \\ 8 & 19 \end{pmatrix}$ corresponds to an edge of the diagram.

For the general argument we start with a matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ of determinant $\pm 1$ with
$b > 0$ and $d > 0$. If $b \ne d$ then we subtract the column with smaller second entry from
the column with larger second entry, and repeat this operation until the two entries in
the second row are equal. We cannot get a 0 in the second row since this would mean
that the previous matrix already had equal entries in the second row. Once we get a
matrix with equal entries in the second row, these entries will divide the determinant
which is $\pm 1$ so these entries must be 1. Thus the matrix is of the form $\begin{pmatrix} a & c \\ 1 & 1 \end{pmatrix}$, with
determinant $a - c = \pm 1$ so $a$ and $c$ differ by 1. The corresponding fractions are then
$n/1$ and $(n+1)/1$ for some integer $n$, and there is an edge of the diagram joining these
two fractions, one of the large semicircles in the upper halfplane diagram. Hence
when we reverse the sequence of column subtractions by performing a sequence of
column additions, each successive matrix will correspond to an edge of the diagram
and in particular $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ will correspond to an edge of the diagram. □
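
The column-subtraction procedure used in this proof can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reduce_by_column_subtraction` and the representation of a matrix as a pair of rows are choices made here, not notation from the text.

```python
def reduce_by_column_subtraction(a, c, b, d):
    """Reduce the matrix (a c; b d), with b > 0 and d > 0, as in the proof.

    At each step the column with the smaller second entry is subtracted from
    the column with the larger second entry, until the second-row entries are
    equal. Returns every matrix visited, each as ((a, c), (b, d)).
    """
    if b <= 0 or d <= 0:
        raise ValueError("second-row entries must be positive")
    steps = [((a, c), (b, d))]
    while b != d:
        if b > d:
            a, b = a - c, b - d  # subtract column 2 from column 1
        else:
            c, d = c - a, d - b  # subtract column 1 from column 2
        steps.append(((a, c), (b, d)))
    return steps


for top, bottom in reduce_by_column_subtraction(3, 7, 8, 19):
    print(top, bottom)
```

For the matrix with rows (3, 7) and (8, 19) the loop visits seven matrices and stops at the one with rows (1, 0) and (1, 1). Each step leaves the determinant unchanged, which is why the reversed sequence of column additions stays among matrices of determinant $\pm 1$.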


The sign of the determinant $ad - bc$ has a simple interpretation for fractions $a/b$
and $c/d$ with positive denominators since in this case the inequality $ad - bc > 0$ is
equivalent to $a/b > c/d$ and $ad - bc < 0$ is equivalent to $a/b < c/d$. Thus the sign of
the determinant tells which of $a/b$ or $c/d$ is larger.


Here is an interesting consequence of the preceding proposition: 


Corollary 1.2. The mediant rule for labeling the vertices in the Farey diagram
always produces labels $a/b$ that are fractions in lowest terms.


This would follow automatically if it was always true that the mediant of two
fractions in lowest terms is again in lowest terms, but this is not always the case. For
example, the mediant of $1/3$ and $2/3$ is $3/6$, and the mediant of $2/7$ and $3/8$ is $5/15$.
Somehow cases like this do not occur in the Farey diagram.

Before deducing the corollary let us introduce a bit of standard terminology. For a
fraction $a/b$ to be in lowest terms means that $a$ and $b$ have no common factor greater
than 1. This is equivalent to saying that the prime factorizations of $a$ and $b$ have no
prime factor in common. When this is the case we say that $a$ and $b$ are coprime. An
alternative terminology is to say that $a$ and $b$ are relatively prime.


Proof: From the way the Farey diagram is constructed, each labeled vertex $a/b$ is
joined to some other labeled vertex $c/d$ by an edge of the diagram. By the easier half
of Proposition 1.1 we have $ad - bc = \pm 1$. This implies that $a$ and $b$ are coprime
since any common divisor of $a$ and $b$ must divide the products $ad$ and $bc$, hence
also the difference $ad - bc = \pm 1$, but the only divisors of $\pm 1$ are $\pm 1$. □


Proposition 1.1 can also be used to prove another basic fact about the Farey dia- 
gram: 


Proposition 1.3. Every fraction $p/q$ in lowest terms occurs as the label on some
vertex in the Farey diagram.


Proof: We may assume $p$ and $q$ are nonzero since $0/1$ and $1/0$ certainly occur as labels
in the diagram. Since the negative labels in the diagram are just the negatives of the
positive labels, we can assume $p$ and $q$ are in fact positive. It will suffice to show that
if $p$ and $q$ are coprime, then there is an edge in the diagram whose endpoints are
labeled $p/q$ and $r/s$ for some integers $r$ and $s$. By Proposition 1.1 this is equivalent
to the existence of integers $r$ and $s$ such that $ps - qr = \pm 1$.

Consider a matrix $\begin{pmatrix} x & y \\ p & q \end{pmatrix}$ where the integers $x$ and $y$ are yet to be determined.
In the proof of Proposition 1.1 there was a procedure for repeatedly subtracting the
column with smaller second entry from the column with larger second entry until a
matrix with equal second entries is obtained. Subtracting one column from the other
does not affect coprimeness of the two second entries, so when the procedure is applied
to a matrix $\begin{pmatrix} x & y \\ p & q \end{pmatrix}$ with $p$ and $q$ coprime, the result is a matrix whose second
entries are equal and coprime, so these entries must be 1. Now let us choose a matrix
of determinant $\pm 1$ whose lower two entries are 1, say the matrix $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$. If we start
with this matrix and apply the reverse of the sequence of operations performed on
$\begin{pmatrix} x & y \\ p & q \end{pmatrix}$ to get 1's in the second row, the resulting sequence of operations of adding
one column to the other converts $\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$ into a matrix $\begin{pmatrix} r & s \\ p & q \end{pmatrix}$ of the same determinant
$\pm 1$. This means that we have found integers $r$ and $s$ such that $rq - ps = \pm 1$,
or equivalently $ps - qr = \pm 1$. □


Implicit in this proof is a method for solving Diophantine equations of the form
$px - qy = \pm 1$ for any two given coprime positive integers $p$ and $q$. In Section 2.3
we will make this procedure explicit and streamline it to be more efficient.
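
As a rough illustration of the method implicit in the proof of Proposition 1.3, the following Python sketch records the column subtractions that reduce the second row $(p, q)$ to $(1, 1)$ and then replays them in reverse as column additions starting from the matrix with rows (1, 0) and (1, 1). The function name `solve_unimodular` and the string labels for the moves are assumptions made for this example. It is the unstreamlined procedure, so it can be slow when $p$ and $q$ are very different in size.

```python
def solve_unimodular(p, q):
    """Return integers (r, s) with r*q - p*s == 1, for coprime positive p and q."""
    # Reduce the second row (p, q) to (1, 1), remembering each subtraction.
    moves = []
    b, d = p, q
    while b != d:
        if b > d:
            b -= d
            moves.append("col1 -= col2")
        else:
            d -= b
            moves.append("col2 -= col1")
    if b != 1:
        raise ValueError("p and q must be coprime")

    # Start from the matrix (x y; b d) = (1 0; 1 1), which has determinant 1,
    # and undo the recorded moves in reverse order by adding columns.
    x, y, b, d = 1, 0, 1, 1
    for move in reversed(moves):
        if move == "col1 -= col2":
            x, b = x + y, b + d
        else:
            y, d = y + x, d + b
    assert (b, d) == (p, q)
    return x, y


r, s = solve_unimodular(8, 19)
print(r, s, r * 19 - 8 * s)
```

For $p = 8$ and $q = 19$ this rebuilds the matrix with rows (3, 7) and (8, 19), so $r = 3$ and $s = 7$. Because the starting matrix has determinant $+1$ and column additions preserve the determinant, this particular sketch always produces $rq - ps = +1$.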
Exercises


1. There is another version of the Farey diagram in which the vertex labeled $p/q$ is
placed at the point $(q,p)$ in the plane, so $p/q$ is the slope of the line through the
origin and $(q,p)$. The edges of this new Farey diagram are straight line segments
connecting the pairs of vertices that are connected in the original Farey diagram. For
example there is a triangle with vertices $(1,0)$, $(0,1)$, and $(1,1)$ corresponding to the
big triangle in the upper half of the circular Farey diagram. With this model of the
Farey diagram the operation of forming the mediant of two fractions just corresponds
to standard vector addition $(a,b) + (c,d) = (a+c, b+d)$.

What you are asked to do in this problem is just to draw the portion of the new
Farey diagram consisting of all the triangles whose vertices $(q,p)$ satisfy $0 \le q \le 5$
and $0 \le p \le 5$. Note that since fractions $p/q$ labeling vertices are always in lowest
terms, the points $(q,p)$ such that $q$ and $p$ have a common divisor greater than 1 are
not vertices of the diagram.


2. Consider a vertex of the Farey diagram labeled $a/b$ with $b > 1$. Show that of all
the labels on vertices connected to the $a/b$ vertex by an edge of the diagram, exactly
two have denominator smaller than $b$.


3. If $a/b$, $c/d$, and $e/f$ are fractions in lowest terms such that $e/f$ is the mediant of
$a/b$ and $c/d$, is it necessarily true that there is a triangle in the Farey diagram with
vertices $a/b$, $c/d$, and $e/f$? Give either a proof or a counterexample.


4. (a) Reduce each of the matrices Ge 5) and ish =) to either G J or d at by 
repeatedly subtracting one column from the other as in the proof of Proposition 1.1. 
(b) Use Proposition 1.1 to show that this can be done for any matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ with
nonnegative entries and determinant $\pm 1$.


1.2 Farey Series 


We can build the set of rational numbers by starting with the integers and then
inserting in succession the halves, thirds, fourths, fifths, sixths, and so on. Let us
look at what happens if we restrict to rational numbers between 0 and 1. Starting
with 0 and 1 we first insert $1/2$, then $1/3$ and $2/3$, then $1/4$ and $3/4$, skipping
$2/4$ which we already have, then inserting $1/5$, $2/5$, $3/5$, and $4/5$, then
$1/6$ and $5/6$, etc.
This process has an interesting property that is really quite surprising when one
first sees it: 


Each time a new number is inserted, it forms the third vertex of a triangle whose
other two vertices are its two nearest neighbors among the numbers already listed,
and if these two neighbors are $a/b$ and $c/d$ then the new vertex is exactly the
mediant $\frac{a+c}{b+d}$.


The discovery of this curious phenomenon in the early 1800s was initially attributed
to a geologist and amateur mathematician named Farey, although it turned out that
he was not the first person to have noticed it. In spite of this confusion, the sequence
of fractions $a/b$ between 0 and 1 with denominator less than or equal to a given
number $n$ is usually called the $n$th Farey series $F_n$. For example, here is $F_7$:

$$\frac{0}{1}\ \ \frac{1}{7}\ \ \frac{1}{6}\ \ \frac{1}{5}\ \ \frac{1}{4}\ \ \frac{2}{7}\ \ \frac{1}{3}\ \ \frac{2}{5}\ \ \frac{3}{7}\ \ \frac{1}{2}\ \ \frac{4}{7}\ \ \frac{3}{5}\ \ \frac{2}{3}\ \ \frac{5}{7}\ \ \frac{3}{4}\ \ \frac{4}{5}\ \ \frac{5}{6}\ \ \frac{6}{7}\ \ \frac{1}{1}$$

These numbers trace out the up-and-down path across the bottom of the figure above.
For the next Farey series $F_8$ we would insert $1/8$ between $0/1$ and $1/7$, $3/8$ between $1/3$
and $2/5$, $5/8$ between $3/5$ and $2/3$, and finally $7/8$ between $6/7$ and $1/1$.
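
The mediant property gives a simple way to generate Farey series by machine. The sketch below is one possible implementation, with the function name `farey_series` and the use of `(numerator, denominator)` tuples being choices made here for illustration: to pass from one series to the next with maximum denominator $m$, it inserts the mediant between each pair of neighbors $a/b$ and $c/d$ with $b + d = m$.

```python
def farey_series(n):
    """Return the nth Farey series as a list of (numerator, denominator) pairs."""
    series = [(0, 1), (1, 1)]
    for m in range(2, n + 1):
        nxt = [series[0]]
        for (a, b), (c, d) in zip(series, series[1:]):
            if b + d == m:
                # The new fraction with denominator m is the mediant of its neighbors.
                nxt.append((a + c, b + d))
            nxt.append((c, d))
        series = nxt
    return series


print(" ".join(f"{a}/{b}" for a, b in farey_series(7)))
```

Calling `farey_series(7)` reproduces the 19 fractions of $F_7$ listed above, and `farey_series(8)` adds exactly the four fractions $1/8$, $3/8$, $5/8$, and $7/8$.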

There is a cleaner way to draw the preceding diagram using straight lines in
a square, as shown in the figure at the right. One can construct this diagram in
stages, as indicated in the sequence of figures below. Start with a square together
with its diagonals and a vertical line from their intersection point down to the bottom
edge of the square. Next, connect the resulting midpoint of the lower edge
of the square to the two upper corners of the square and drop vertical lines down
from the two new intersection points this produces. Now add a W-shaped zigzag
and drop verticals again. It should then be clear how to continue.




A nice feature of this construction is that if we start with a square whose sides
have length 1 and place this square so that its bottom edge lies along the $x$-axis with
the lower left corner of the square at the origin, then the construction assigns labels
to the vertices along the bottom edge of the square that are exactly the $x$-coordinates
of these points. Thus the vertex labeled $1/2$ really is at the midpoint of the bottom
edge of the square, and the vertices labeled $1/3$ and $2/3$ really are $1/3$ and $2/3$ of the
way along this edge, and so forth. In order to verify this fact the key observation
is the following: For a vertical line segment in the diagram whose lower endpoint is
at the point $(a/b, 0)$ on the $x$-axis, the upper endpoint is at the point $(a/b, 1/b)$.
This is obviously true at the first stage of the construction, and it continues to hold
at each successive stage since for a quadrilateral whose four vertices have coordinates
$(a/b, 0)$, $(c/d, 0)$, $(a/b, 1/b)$, and $(c/d, 1/d)$, as shown in the figure at the right,
the two diagonals intersect at the point $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$. For
example, to verify that $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ is on the
upward diagonal line from $(a/b, 0)$ to $(c/d, 1/d)$ it
suffices to show that the line segments from $(a/b, 0)$ to $\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ and from
$\left(\frac{a+c}{b+d}, \frac{1}{b+d}\right)$ to $(c/d, 1/d)$ have the same slope. These slopes are

$$\frac{\frac{1}{b+d} - 0}{\frac{a+c}{b+d} - \frac{a}{b}} \cdot \frac{b(b+d)}{b(b+d)} = \frac{b}{b(a+c) - a(b+d)} = \frac{b}{bc - ad}$$

and

$$\frac{\frac{1}{d} - \frac{1}{b+d}}{\frac{c}{d} - \frac{a+c}{b+d}} \cdot \frac{d(b+d)}{d(b+d)} = \frac{b+d-d}{c(b+d) - d(a+c)} = \frac{b}{bc - ad}$$

so they are equal. The same argument works for the other diagonal by interchanging
$a/b$ and $c/d$. Note that the denominator $bc - ad$ in the slope formulas above is $\pm 1$
since $a/b$ and $c/d$ are the endpoints of an edge of the Farey diagram. Thus each
diagonal line in the square Farey diagram has integer slope, and this integer is, up to
sign, the denominator of the rational number where the line meets the $x$-axis.

Going back to the square diagram, this fact that we have just shown implies that
the successive Farey series can be obtained by taking the vertices that lie above the
line $y = 1/2$, then the vertices above $y = 1/3$, then above $y = 1/4$, and so on.

We can form a linear version of the full Farey diagram by placing copies of the 
square side by side along the x-axis: 


Here the vertical segments in the horizontal strip are not part of the resulting Farey
diagram, which consists just of the triangles with nonvertical edges, along with the
infinite “triangles” above the strip with a vertex at $1/0$. The original halfplane Farey
diagram can be obtained from this linear Farey diagram by shrinking each vertical
segment in the horizontal strip down to its lower endpoint while bending each straight
edge of a triangle into a semicircle with endpoints on the $x$-axis.


Another version of the Farey diagram can be constructed from an array of circles 
in the upper halfplane tangent to the x-axis and to each other as in the following 
figure: 


This arrangement of tangent circles can be built in stages, starting with circles of 
diameter 1 tangent to the x-axis at the integer points. At the next stage a smaller 
circle is inserted in each gap between adjacent pairs of circles from the first stage. 
This creates new gaps, and one then puts a still smaller circle in each of these gaps. 
The process can then be repeated indefinitely all along the x-axis. 

If we connect the centers of each pair of tangent circles by a line segment passing
through the point of tangency, we obtain a pattern of triangles that is combinatorially
equivalent to the pattern of triangles in the linear Farey diagram, but compressed
closer to the $x$-axis. The vertices of these triangles are the centers of the various
tangent circles, and we can label these centers by rational numbers, starting with an
integer label $n/1$ at the center of the large circle tangent to the $x$-axis at the point $n$,
and then labeling all the other centers by applying the mediant rule repeatedly.

The surprising thing about this construction is that the circle whose center is
labeled $a/b$ is tangent to the $x$-axis at exactly the point $a/b$ on the $x$-axis. This can
be verified as follows. For an edge of the Farey diagram with endpoints labeled $a/b$ and
$c/d$ let us draw two circles tangent to each other and
tangent to the $x$-axis at the points $a/b$ and $c/d$. Let
the radii of these two circles be $r$ and $s$ respectively.
Note that $r$ and $s$ are not uniquely determined by
$a/b$ and $c/d$. In fact we can choose $r$ arbitrarily and
then this determines $s$, with $s$ becoming small as $r$
becomes large, and vice versa. We can find a formula
for how $r$ and $s$ are related by applying the Pythagorean theorem to the right triangle
shown in the figure. The horizontal side of this triangle has length $|c/d - a/b|$ and
the vertical side has length $|r - s|$. The condition for the two circles to be tangent is
that the hypotenuse of the triangle has length $r + s$. Thus we have:

$$\left(\frac{c}{d} - \frac{a}{b}\right)^2 + (r - s)^2 = (r + s)^2$$

This simplifies to:

$$\left(\frac{ad - bc}{bd}\right)^2 = 4rs$$

Since we assumed the fractions $a/b$ and $c/d$ were the endpoints of an edge in the
Farey diagram, we have $ad - bc = \pm 1$ so the preceding equation simplifies further
to $\left(\frac{1}{bd}\right)^2 = 4rs$. The easiest way to assure that this holds is to let $r = \frac{1}{2b^2}$ and
$s = \frac{1}{2d^2}$, so that $r$ depends only on $a/b$ and $s$ depends only on $c/d$. Thus we
are choosing the diameter of each circle to be the reciprocal of the square of the
denominator of the fraction where the circle is tangent to the $x$-axis. This is consistent
with how we chose the initial large circles tangent to the $x$-axis at integer points.
Then when we build the Farey diagram inductively by adding more and more vertices
labeled according to the mediant rule, each new vertex labeled $\frac{a+c}{b+d}$ between
vertices labeled $a/b$ and $c/d$ is the center of a circle of diameter $\frac{1}{(b+d)^2}$ tangent to
the $x$-axis at $\frac{a+c}{b+d}$ and tangent to each of the two circles labeled $a/b$ and $c/d$ of
diameters $\frac{1}{b^2}$ and $\frac{1}{d^2}$ that are tangent to the $x$-axis at $a/b$ and $c/d$.
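
The tangency condition can be checked with exact rational arithmetic. The following Python sketch is an illustration only, and the helper names `ford_circle` and `are_tangent` are assumptions introduced here: it places the circle for $a/b$ at center $(a/b, \frac{1}{2b^2})$ with radius $\frac{1}{2b^2}$ and tests whether the squared distance between two centers equals the square of the sum of the radii.

```python
from fractions import Fraction


def ford_circle(a, b):
    """Center (x, y) and radius of the circle tangent to the x-axis at a/b, b > 0."""
    radius = Fraction(1, 2 * b * b)
    return (Fraction(a, b), radius), radius


def are_tangent(a, b, c, d):
    """True if the circles at a/b and c/d are tangent to each other."""
    (x1, y1), r = ford_circle(a, b)
    (x2, y2), s = ford_circle(c, d)
    # Two circles are externally tangent when the distance between their
    # centers equals the sum of their radii (compared here via squares).
    return (x2 - x1) ** 2 + (y2 - y1) ** 2 == (r + s) ** 2


print(are_tangent(1, 2, 1, 3))  # determinant 1*3 - 2*1 = 1
print(are_tangent(1, 3, 2, 3))  # determinant 1*3 - 3*2 = -3
```

The first call prints `True` and the second prints `False`, in line with the computation above: the circles are tangent exactly when $ad - bc = \pm 1$.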

The circles tangent to the x-axis constructed in this way are called Ford circles 
after their discoverer L.R. Ford. From the formula for their diameters we see that the 
Ford circles whose diameter is greater than a fixed number are just the ones associated 
to the fractions in a Farey series, if we restrict attention to the circles tangent to the 
x-axis at points between 0 and 1. 

Another very nice feature of Ford circles is that when we superimpose them on 
the upper halfplane Farey diagram, the semicircles of the Farey diagram intersect the 
Ford circles orthogonally at the points of tangency of the Ford circles, as shown in the 
following figure: 


The fact that the circles and semicircles intersect orthogonally at the tangency points
of the circles can be verified by considering the tangent lines to the circles at the
points where two circles are tangent. The key fact
is that for any two nonparallel tangent lines to a
circle, the distances from the points of tangency to
the intersection point of the two tangent lines are
equal. This is because reflecting across the radial
line through the intersection point takes one tangent
line to the other.


Exercises 


1. Compute the Farey series Fj. 


2. Draw a figure showing how Ford circles are positioned in a circular Farey diagram
by the following procedure. Start with a circle $C$ of radius 1 which will be the outer
boundary of the Farey diagram. Next, draw two tangent circles of radius $1/2$ inside $C$
and tangent to $C$ at two opposite points of $C$. Label these two tangency points $1/0$ and
$0/1$. Now continue drawing smaller circles inside $C$ with the same tangency patterns
as the Ford circles in the upper halfplane Farey diagram, and label the tangency points
of these circles with $C$ according to the mediant rule. After a number of these circles
have been drawn, superimpose the semicircles of the Farey diagram itself.


3. In the diagram of Ford circles consider a vertical line $x = r$ for $r$ a real number.
Show that this line intersects a finite number of Ford circles if $r$ is rational and an
infinite number of Ford circles if $r$ is irrational. Deduce that for each irrational number
$r$ there exists an infinite sequence of rational numbers $p_n/q_n$ approaching $r$ and
approximating $r$ in the sense that the following inequality holds for each $n$:

$$\left| r - \frac{p_n}{q_n} \right| < \frac{1}{2q_n^2}$$

Specifically, these are the fractions $p_n/q_n$ labeling the circles that the line $x = r$
crosses.


4. Suppose two Ford circles tangent to the $x$-axis at points $a/b$ and $c/d$ are tangent
to each other. Show that the point of tangency between the two circles is the point

$$\left(\frac{ab + cd}{b^2 + d^2},\ \frac{1}{b^2 + d^2}\right)$$

so in particular the coordinates of this point are rational. Hint: What proportion of
the way along the line segment joining the two centers is the point of tangency? This
same proportion will apply to $x$-coordinates and $y$-coordinates separately.
2 Continued Fractions


Continued fractions are expressions of the following sort, with all numerators
equal to 1:

$$\frac{7}{16} = \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{2}}} \qquad\qquad \frac{67}{24} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{4}}}}$$

To compute the value of a continued fraction one starts in the lower right corner and
works one’s way upward. For example, in the continued fraction for $7/16$ one starts
with $3 + 1/2 = 7/2$, then taking 1 over this gives $2/7$, and adding 2 to this gives $16/7$,
and finally 1 over this gives $7/16$. In the case of the continued fraction for $67/24$ the
fractions arising by this process are $5/4$, $4/5$, $19/5$, $5/19$, $24/19$, $19/24$, and finally $67/24$.
As we will see, there is a fairly simple way to express every rational number as a
continued fraction.
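
The bottom-up evaluation just described is easy to mechanize. In the sketch below the function name `evaluate_continued_fraction` and the convention of passing the integers as a list, with a leading 0 when there is no integer part, are assumptions made for illustration.

```python
from fractions import Fraction


def evaluate_continued_fraction(terms):
    """Value of t0 + 1/(t1 + 1/(... + 1/tn)) for the list terms = [t0, t1, ..., tn]."""
    # Start in the lower right corner and work upward, as in the text.
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value


print(evaluate_continued_fraction([0, 2, 3, 2]))     # 7/16
print(evaluate_continued_fraction([2, 1, 3, 1, 4]))  # 67/24
```

Using `Fraction` keeps every intermediate value exact, so the intermediate results are the same fractions $7/2$, $16/7$ and $5/4$, $19/5$, $24/19$ that appear in the hand computation.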

The main theme of this chapter will be the close relationship between continued 
fractions and the Farey diagram. For example, the fact that all rational numbers occur 
as labels on vertices in the Farey diagram is a reflection of the fact that every rational 
number has an expression as a continued fraction. In fact the continued fraction for 
a rational number $p/q$ will tell how to locate the vertex labeled $p/q$.

We will also consider continued fractions with infinitely many terms extending 
downward to the right. These will give expressions for irrational numbers, somewhat 
like expressing irrational numbers as infinite decimals. Continued fractions have the 
advantage that rational numbers are expressible as finite continued fractions whereas 
the decimal representations for rational numbers are not generally finite but are in- 
stead just eventually periodic. Infinite continued fractions that are eventually periodic 
correspond to a special class of irrational numbers, those that are roots of quadratic 
equations with integer coefficients, like $\sqrt{2}$. Thus continued fractions are better than
decimals in some ways, but on the other hand simple operations like addition and 
multiplication of rational numbers do not have nice descriptions in terms of contin- 
ued fractions. In spite of these limitations continued fractions are quite useful in 
Number Theory. Among other things, they can be used to solve certain Diophantine 
equations including linear ones as we will see in Section 2.3. 
2.1 Finite Continued Fractions


Here is the general form of a continued fraction:

$$\frac{p}{q} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}$$

The numbers $a_i$ are assumed to be positive integers except for $a_0$ which can be any
integer, positive, negative, or zero. When it is zero it can be omitted from the formula.
To write a continued fraction in more compact form on a single line, one can write it
as $p/q = a_0 + 1\nearrow a_1 + 1\nearrow a_2 + \cdots + 1\nearrow a_n$ with diagonal arrows to indicate the extended
horizontal bars in the previous notation, for example $7/16 = 1\nearrow 2 + 1\nearrow 3 + 1\nearrow 2$ and
$67/24 = 2 + 1\nearrow 1 + 1\nearrow 3 + 1\nearrow 1 + 1\nearrow 4$. An even more concise notation that is sometimes used
is $[a_0; a_1, a_2, \cdots, a_n]$, or just $[a_1, a_2, \cdots, a_n]$ when there is no $a_0$ term. However,
we will use the more suggestive arrow notation in this book.


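To compute the continued fraction for a given rational number, one starts in the
upper left corner and works one’s way downward, as the following example shows:

$$\begin{aligned}
\frac{67}{24} &= 2 + \frac{19}{24} = 2 + \cfrac{1}{24/19} = 2 + \cfrac{1}{1 + 5/19} = 2 + \cfrac{1}{1 + \cfrac{1}{19/5}} \\
&= 2 + \cfrac{1}{1 + \cfrac{1}{3 + 4/5}} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{5/4}}} = 2 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{4}}}}
\end{aligned}$$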


The key steps are the equations $67/24 = 2 + 19/24$, $24/19 = 1 + 5/19$, $19/5 = 3 + 4/5$, and
$5/4 = 1 + 1/4$. If we clear fractions in each of these equations we obtain the first four
of the following five equations, which show a sequence of repeated divisions starting
with a given pair of positive integers, 67 and 24 in this case:

$$\begin{aligned}
67 &= \boxed{2} \cdot 24 + 19 \\
24 &= \boxed{1} \cdot 19 + 5 \\
19 &= \boxed{3} \cdot 5 + 4 \\
5 &= \boxed{1} \cdot 4 + 1 \\
4 &= \boxed{4} \cdot 1 + 0
\end{aligned}$$

One first divides the smaller number into the larger to obtain a quotient and a
remainder which is smaller than the divisor. Then at each
successive step one divides the previous remainder into
the previous divisor. The process stops when one obtains a remainder of zero. This
process is known as the Euclidean algorithm. The boxed numbers are the
quotients of the successive divisions and are sometimes called the partial quotients.
These are the numbers $a_i$ in the continued fraction for $67/24$.
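
The repeated divisions translate directly into a short loop. The following Python sketch, whose function name `partial_quotients` is an assumption made here, returns the list of quotients produced by the Euclidean algorithm for a fraction with positive denominator.

```python
def partial_quotients(p, q):
    """Partial quotients of p/q, for an integer p and a positive integer q."""
    quotients = []
    while q != 0:
        a, remainder = divmod(p, q)  # p = a*q + remainder with 0 <= remainder < q
        quotients.append(a)
        p, q = q, remainder          # divide the previous remainder into the previous divisor
    return quotients


print(partial_quotients(67, 24))  # [2, 1, 3, 1, 4]
print(partial_quotients(7, 16))   # [0, 2, 3, 2]
```

For a fraction between 0 and 1 such as $7/16$ the first quotient is 0, which corresponds to omitting the initial integer from the continued fraction.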
One of the classical uses for the Euclidean algorithm is to find the greatest common
divisor of two given numbers. If one applies the algorithm to two numbers
$p$ and $q$, dividing the smaller into the larger, then the remainder into the first divisor,
and so on, then the greatest common divisor of $p$ and
$q$ turns out to be the last nonzero remainder. For example,
starting with $p = 72$ and $q = 201$ the calculation is as follows,
and the last nonzero remainder is 3,
which is the greatest common divisor of 72 and 201:

$$\begin{aligned}
201 &= \boxed{2} \cdot 72 + 57 \\
72 &= \boxed{1} \cdot 57 + 15 \\
57 &= \boxed{3} \cdot 15 + 12 \\
15 &= \boxed{1} \cdot 12 + 3 \\
12 &= \boxed{4} \cdot 3 + 0
\end{aligned}$$

(In fact the fraction $201/72$ equals $67/24$, which explains why
the successive quotients for this example are the same as in the preceding example.)
It is easy to see from the displayed equations why 3 has to be the greatest common
divisor of 72 and 201, since from the first equation it follows that any divisor of 72
and 201 must also divide 57, then the second equation shows it must divide 15, the
third equation then shows it must divide 12, and the fourth equation shows it must
divide 3, the last nonzero remainder. Conversely, if a number divides the last nonzero
remainder 3, then the last equation shows it must also divide 12, and the next-to-last
equation then shows it must divide 15, and so on until we conclude that it divides all
the numbers other than the boxed quotients, including the original two numbers 72 and
201. The same reasoning applies in general.


A more obvious way to try to compute the greatest common divisor of two num- 
bers would be to factor each of them into a product of primes, then look to see which 
primes occurred as factors of both, and to what power. But to factor a large number 
into its prime factors is a very laborious and time-consuming process. For example, 
even a large computer would have a hard time factoring a number with a hundred or 
more digits into primes, so it would not be feasible to find the greatest common divi- 
sor of a pair of numbers of this size in this way. However, the computer would have 
no trouble applying the Euclidean algorithm to find their greatest common divisor. 
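
To make the last point concrete, here is a minimal Python sketch of the greatest common divisor computed by the Euclidean algorithm. The function name `gcd_euclid` and the particular hundred-digit test numbers are choices made for this illustration.

```python
def gcd_euclid(p, q):
    """Greatest common divisor of two positive integers: the last nonzero remainder."""
    while q != 0:
        p, q = q, p % q  # replace (divisor, remainder) by the next pair
    return p


print(gcd_euclid(72, 201))  # 3

# Two numbers with about a hundred digits each, sharing the factor 3.
big = 10**100
print(gcd_euclid(3 * big, 3 * (big + 1)))  # 3, since big and big + 1 are coprime
```

No factoring is involved: each pass through the loop is a single division with remainder, and the remainders shrink at every step.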


Having seen what continued fractions are, let us now see what they have to do with
the Farey diagram. Some examples will illustrate this best, so let us first look at the
continued fraction for $7/16$ again. This has 2, 3, 2 as its sequence of partial quotients.
We use these three numbers to build a strip of three large triangles subdivided into
2, 3, and 2 smaller triangles, from left to right:


[Figure: the strip of three large triangles, subdivided into 2, 3, and 2 smaller triangles.]


We can think of the strip as being formed from three “fans”, where the first fan is
made from the first two smaller triangles, the second fan from the next three smaller
triangles, and the third fan from the last two smaller triangles. Now we begin labeling
the vertices of this strip. On the left edge we start with the labels $1/0$ and $0/1$. Then
we use the mediant rule for computing the third label of each triangle in succession
as we move from left to right in the strip. Thus we insert, in order, the labels $1/1$, $1/2$,
$1/3$, $2/5$, $3/7$, $4/9$, and finally $7/16$.
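
The labeling of the strip can be sketched in code. The Python function below is an illustration under stated assumptions: the name `strip_labels` is hypothetical, the input is the list of partial quotients of a fraction between 0 and 1, and labels are stored as `(numerator, denominator)` pairs. Within each fan one vertex stays fixed while the mediant rule is applied repeatedly, and between fans the roles of the two current vertices are exchanged.

```python
def strip_labels(quotients):
    """Labels inserted along the strip, starting from the edge joining 1/0 and 0/1."""
    pivot, moving = (0, 1), (1, 0)  # the fan turns about `pivot`
    labels = []
    for a in quotients:
        for _ in range(a):
            # Mediant of the moving vertex and the pivot gives the next label.
            moving = (moving[0] + pivot[0], moving[1] + pivot[1])
            labels.append(moving)
        # The last label of this fan becomes the pivot of the next fan.
        pivot, moving = moving, pivot
    return labels


print(strip_labels([2, 3, 2]))
```

For the partial quotients 2, 3, 2 this produces the pairs for $1/1$, $1/2$, $1/3$, $2/5$, $3/7$, $4/9$, $7/16$, in that order.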

Was it just an accident that the final label was the fraction $7/16$ that we started
with, or does this always happen? Here is a second example:


Again the final vertex on the right has the same label as the fraction we started with. 

In fact this always works for fractions $p/q$ between 0 and 1. For fractions larger
than 1 the procedure works if we modify it by replacing the numerator 0 of the label
$0/1$ with $a_0$, the initial integer in the continued fraction $p/q = a_0 + 1\nearrow a_1 + 1\nearrow a_2 + \cdots + 1\nearrow a_n$.
Thus $0/1$ is replaced by $a_0/1$. This is illustrated by the $67/24$ example:


[Figure: the strip of triangles for $67/24$, with final vertex labels $11/4$, $25/9$, $39/14$, $53/19$, $67/24$.]


For comparison, here is the corresponding strip for the reciprocal, 24/67: 




Now let us see how all this relates to the Farey diagram. Since the initial edge of
the strip joining $1/0$ and $a_0/1$ is an edge of the Farey diagram and the rule for labeling
subsequent vertices along the strip is the mediant rule, each of the triangles in the
strip is a triangle in the Farey diagram, so the strip of triangles can be regarded as
a sequence of adjacent triangles in the diagram. Here is what this looks like for the
fraction $7/16$ in the circular Farey diagram, slightly distorted for visual clarity:


In the strip of triangles for a fraction $p/q$ there is a zigzag path from $1/0$ to $p/q$
that we have indicated by the heavily shaded edges. The vertices that this zigzag
path passes through have a special significance: They are the fractions that occur
as the values of successively larger initial portions of the continued fraction, as
illustrated at the right for the earlier example of $67/24$. These fractions are called the
convergents for the given fraction. Thus the convergents for $67/24$ are 2, 3, $11/4$, $14/5$,
and $67/24$ itself.

From the preceding examples one can see that each successive vertex label $p_i/q_i$
along the zigzag path for a continued fraction $p/q = a_0 + 1\nearrow a_1 + \cdots + 1\nearrow a_n$ is computed
in terms of the two preceding vertex labels according to the following formula:

$$\frac{p_i}{q_i} = \frac{a_i p_{i-1} + p_{i-2}}{a_i q_{i-1} + q_{i-2}}$$

This is because the mediant rule is being applied $a_i$ times, “adding” $p_{i-1}/q_{i-1}$ to the
previously obtained fraction each time until the next label $p_i/q_i$ is obtained.
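
The formula for successive vertex labels can be turned into a short routine. In this Python sketch the function name `convergents` is an assumption, and so is the bookkeeping convention of treating the starting vertex $1/0$ of the zigzag path as the label preceding $a_0/1$, which lets the same formula produce the convergent after $a_0/1$.

```python
def convergents(quotients):
    """Convergents of a0 + 1/(a1 + ... + 1/an) as (p_i, q_i) pairs, quotients = [a0, ..., an]."""
    p_prev, q_prev = 1, 0       # the starting vertex 1/0
    p, q = quotients[0], 1      # the first convergent a0/1
    result = [(p, q)]
    for a in quotients[1:]:
        # p_i = a_i * p_{i-1} + p_{i-2}, and similarly for q_i.
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append((p, q))
    return result


print(convergents([2, 1, 3, 1, 4]))
```

For the partial quotients 2, 1, 3, 1, 4 this gives the pairs for 2, 3, $11/4$, $14/5$, and $67/24$, matching the convergents listed above.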


It is interesting to see what the zigzag paths corresponding to continued fractions
look like in the upper halfplane Farey diagram. The next figure shows the simple
example of the continued fraction for $3/8$. We can see here that the five triangles
of the strip correspond to the four curvilinear triangles lying directly above $3/8$ in
the Farey diagram, plus the fifth “triangle” extending upward to infinity, bounded on
the left and right by the vertical lines above $0/1$ and $1/1$, and bounded below by the
semicircle from $0/1$ to $1/1$.


L 1 
0 0 


oļ= 
ujn 
oo |w 


i e 
wir 


wr 


o TR 1 3234 1 
1 5 4 3 2 5 3 4 5 
3 


8 
This example is typical of the general case, where the zigzag path for a continued
fraction p/q = a_0 + 1/a_1 + ··· + 1/a_n becomes a "pinball path" in the Farey diagram,
starting down the vertical line from 1/0 to a_0/1, then turning left across a_1 triangles,
then right across a_2 triangles, then left across a_3 triangles, continuing to alternate
left and right turns until reaching the final vertex p/q. Two consequences of this are:

• The convergents are alternately smaller than and greater than p/q.
• The triangles that form the strip of triangles for p/q are exactly the triangles in
  the Farey diagram that lie directly above the point p/q on the x-axis.
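The first of these consequences can be checked numerically. Here is a small Python sketch, assuming the terms are given as a list and using exact rational arithmetic; the helper names `convergent_values` and `signs_relative_to_value` are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def convergent_values(terms):
    """Evaluate each initial portion a_0 + 1/a_1 + ... + 1/a_i from right to left."""
    values = []
    for i in range(len(terms)):
        value = Fraction(terms[i])
        for a in reversed(terms[:i]):
            value = a + 1 / value
        values.append(value)
    return values

def signs_relative_to_value(terms):
    """Return -1, +1 or 0 for each convergent lying below, above or at the final value."""
    values = convergent_values(terms)
    target = values[-1]
    return [(v > target) - (v < target) for v in values]

# For 67/24 the convergents 2, 3, 11/4, 14/5 alternate below and above 67/24.
print(signs_relative_to_value([2, 1, 3, 1, 4]))
```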


Here is a general statement describing the relationship between continued frac- 
tions and the Farey diagram that we have observed in all our examples so far: 


Theorem 2.1. The convergents for a continued fraction p/q = a_0 + 1/a_1 + ··· + 1/a_n
are the vertices along a zigzag path consisting of a finite sequence of edges in the
Farey diagram, starting at 1/0 and ending at p/q. The path starts along the edge
from 1/0 to a_0/1, then turns left across a fan of a_1 triangles, then right across a
fan of a_2 triangles, etc., alternating left and right turns and finally ending at p/q.


Proof: The continued fraction p/q = a_0 + 1/a_1 + ··· + 1/a_n determines a strip of
triangles:

[Figure: the strip of triangles with left edge from 1/0 to a_0/1 = p_0/q_0 and with zigzag vertices ···, p_{i-2}/q_{i-2}, p_{i-1}/q_{i-1}, p_i/q_i, ···, p_{n-1}/q_{n-1}, p_n/q_n.]

We will show that the label p_n/q_n on the final vertex in this strip is equal to p/q,
the value of the continued fraction. Replacing n by i, we conclude that this holds
also for each initial segment a_0 + 1/a_1 + ··· + 1/a_i of the continued fraction. This is
just saying that the vertices p_i/q_i along the strip are the convergents to p/q, which
is what the theorem claims.

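To prove that p_n/q_n = p/q we will use 2 × 2 matrices. Consider the following
product:

\[
P = \begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}
    \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}
    \begin{pmatrix} 0 & 1 \\ 1 & a_2 \end{pmatrix}
    \cdots
    \begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}
\]

We can multiply this product out starting either from the left or from the right.
Suppose first that we multiply starting at the left. The two columns of the first matrix
give the two fractions 1/0 and a_0/1 labeling the left edge of the strip of triangles.
Multiplying the first matrix by the second matrix gives:

\[
\begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}
= \begin{pmatrix} a_0 & 1 + a_0 a_1 \\ 1 & a_1 \end{pmatrix}
= \begin{pmatrix} p_0 & p_1 \\ q_0 & q_1 \end{pmatrix}
\]

The two columns here give the fractions at the ends of the second edge of the zigzag
path. The same thing happens for subsequent matrix multiplications, as multiplying
by the next matrix in the product takes the matrix corresponding to one edge of the
zigzag path to the matrix corresponding to the next edge:

\[
\begin{pmatrix} p_{i-2} & p_{i-1} \\ q_{i-2} & q_{i-1} \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix}
= \begin{pmatrix} p_{i-1} & p_{i-2} + a_i p_{i-1} \\ q_{i-1} & q_{i-2} + a_i q_{i-1} \end{pmatrix}
= \begin{pmatrix} p_{i-1} & p_i \\ q_{i-1} & q_i \end{pmatrix}
\]

In the end, when all the matrices have been multiplied, we obtain the matrix
corresponding to the last edge in the strip from p_{n-1}/q_{n-1} to p_n/q_n. Thus the second
column of the product P is the column vector with entries p_n and q_n, and what remains
to show is that this equals the column vector with entries p and q, where p/q is the
value of the continued fraction a_0 + 1/a_1 + ··· + 1/a_n.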
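The left-to-right multiplication can be followed step by step on a concrete example. Below is a minimal Python sketch, assuming 2 × 2 matrices are stored as nested tuples of rows; the names `mat_mul` and `edge_matrices` are our own and are not taken from the text.

```python
def mat_mul(A, B):
    """Multiply two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def edge_matrices(terms):
    """Partial products of (1 a_0; 0 1)(0 1; 1 a_1)...(0 1; 1 a_n), taken from the left."""
    product = ((1, terms[0]), (0, 1))
    partial = [product]
    for a in terms[1:]:
        product = mat_mul(product, ((0, 1), (1, a)))
        partial.append(product)
    return partial

# Each matrix has the two endpoints of one edge of the zigzag path as its columns.
for M in edge_matrices([2, 1, 3, 1, 4]):
    print(M)
```

For 67/24 the last matrix printed has columns 14/5 and 67/24, the endpoints of the final edge of the zigzag path.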

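The value of the continued fraction a_0 + 1/a_1 + ··· + 1/a_n is computed by working
from right to left. If we let r_i/s_i be the value of the tail 1/a_i + 1/a_{i+1} + ··· + 1/a_n of
the continued fraction, then we have:

\[
\frac{r_n}{s_n} = \frac{1}{a_n}, \qquad
\frac{r_i}{s_i} = \frac{1}{a_i + \dfrac{r_{i+1}}{s_{i+1}}} = \frac{s_{i+1}}{r_{i+1} + a_i s_{i+1}}, \qquad
\frac{p}{q} = a_0 + \frac{r_1}{s_1} = \frac{r_1 + a_0 s_1}{s_1}
\]

Expressed in terms of matrices these equations become:

\[
\begin{pmatrix} r_n \\ s_n \end{pmatrix} = \begin{pmatrix} 1 \\ a_n \end{pmatrix}, \qquad
\begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix}
\begin{pmatrix} r_{i+1} \\ s_{i+1} \end{pmatrix}
= \begin{pmatrix} s_{i+1} \\ r_{i+1} + a_i s_{i+1} \end{pmatrix}
= \begin{pmatrix} r_i \\ s_i \end{pmatrix}
\]

and

\[
\begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} r_1 \\ s_1 \end{pmatrix}
= \begin{pmatrix} r_1 + a_0 s_1 \\ s_1 \end{pmatrix}
= \begin{pmatrix} p \\ q \end{pmatrix}
\]

This means that when we multiply out the product P starting from the right, the
second columns will be successively (r_n, s_n), (r_{n-1}, s_{n-1}), ···, (r_1, s_1), and
finally (p, q). We have already shown that the second column of P is (p_n, q_n), so
p/q = p_n/q_n and the proof is complete. □

An interesting fact that can be deduced from the preceding proof is that for a
continued fraction 1/a_1 + ··· + 1/a_n with no initial integer a_0, if we reverse the order
of the numbers a_i, this leaves the denominator unchanged. For example:

\[
1/2 + 1/3 + 1/4 = \frac{13}{30} \qquad\text{and}\qquad 1/4 + 1/3 + 1/2 = \frac{7}{30}
\]

To see why this must always be true we use the operation of transposing a matrix to
interchange its rows and columns. For a 2 × 2 matrix this just amounts to interchanging
the upper-right and lower-left entries, so the transpose of a matrix
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} is A^T = \begin{pmatrix} a & c \\ b & d \end{pmatrix}.
Transposing a product of matrices reverses the order of the factors, so
one has (AB)^T = B^T A^T as the reader can check by direct calculation. In the product

\[
\begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & a_2 \end{pmatrix}
\cdots
\begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}
= \begin{pmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{pmatrix}
\]

the individual matrices on the left side of the equation are symmetric with respect to
transposition, so the transpose of the product is obtained by just reversing the order
of the factors:

\[
\begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}
\begin{pmatrix} 0 & 1 \\ 1 & a_{n-1} \end{pmatrix}
\cdots
\begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}
= \begin{pmatrix} p_{n-1} & q_{n-1} \\ p_n & q_n \end{pmatrix}
\]

Thus we see that reversing the order of the terms a_1, ···, a_n leaves the denominator
q_n unchanged, as claimed.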

There is also a fairly simple relationship between the numerators. In the example
of 13/30 and 7/30 we see that the product of the numerators, 91, is congruent to
1 modulo the denominator. In the general case the product of the numerators is
p_n q_{n-1} and this is congruent to (-1)^{n+1} modulo the denominator q_n. To verify this,
we note that the determinant of each factor \begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix} is -1 so since the determinant
of a product is the product of the determinants, we have p_{n-1} q_n - p_n q_{n-1} = (-1)^n,
which implies that p_n q_{n-1} is congruent to (-1)^{n+1} modulo q_n.
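Both facts about reversing the terms can be tested on examples. The following Python sketch, under the assumption that the terms a_1, ···, a_n are given as a list with no initial integer, evaluates the continued fraction and its reversal with exact rational arithmetic; the function names `evaluate` and `check_reversal` are hypothetical.

```python
from fractions import Fraction

def evaluate(terms):
    """Value of 1/a_1 + 1/a_2 + ... + 1/a_n (no initial integer), from right to left."""
    value = Fraction(0)
    for a in reversed(terms):
        value = 1 / (a + value)
    return value

def check_reversal(terms):
    """Compare a continued fraction with the one obtained by reversing its terms."""
    forward = evaluate(terms)
    backward = evaluate(terms[::-1])
    n, q = len(terms), forward.denominator
    same_denominator = backward.denominator == q
    sign = (-1) ** (n + 1)
    # The product of the numerators should be congruent to (-1)^(n+1) modulo q.
    congruent = (forward.numerator * backward.numerator - sign) % q == 0
    return forward, backward, same_denominator, congruent

print(check_reversal([2, 3, 4]))
```

For the terms 2, 3, 4 this returns the values 13/30 and 7/30 from the example above, with both checks succeeding.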


Exercises 


1. (a) Compute the values of the continued fractions 14, + 14, + 1⁄5 + 44 and 
PAL Le Dae Le 

(b) Compute the continued fraction expansions of 19/44 and 101020. 

(c) Draw the strips of triangles corresponding to the continued fractions in parts (a) 
and (b). 


2. (a) Compute the continued fraction for 38/83 and display the steps of the Euclidean
algorithm for 38 and 83 as a sequence of equations involving only integers.

(b) For the same number 38/83 compute the associated strip of triangles (with large
triangles subdivided into fans of smaller triangles), including the labeling of the vertices
of all the triangles.

(c) Take the continued fraction 1/a_1 + 1/a_2 + ··· + 1/a_n you got in part (a) and reverse
the order of the numbers a_i to get a continued fraction 1/a_n + 1/a_{n-1} + ··· + 1/a_1.
Compute the value p/q of this continued fraction, and also compute the strip of
triangles for this fraction p/q. What is the relationship between p/q and 38/83?


3. Let p_n/q_n be the value of the continued fraction 1/a_1 + 1/a_2 + ··· + 1/a_n where
each of the n terms a_i is equal to 2. Thus p_1/q_1 = 1/2, p_2/q_2 = 1/2 + 1/2 = 2/5, etc.
(a) Find equations expressing p_n and q_n in terms of p_{n-1} and q_{n-1}, and use these
to write down the values of p_n/q_n for n = 1,2,3,4,5,6,7.

(b) Compute the strip of triangles for p_7/q_7.


4. (a) A rectangle with sides of length 13 and 48 can be partitioned into squares in
the way shown in the figure at the right. Determine the lengths of the sides of all the
squares, and relate the numbers of squares of each size to the continued fraction
for 13/48.

(b) Draw the analogous figure decomposing a rectangle of sides 19 and 42 into
squares, and relate this to the continued fraction for 19/42.


5. This exercise is intended to illustrate the proof of Theorem 2.1 in the concrete case
of the continued fraction 1/2 + 1/3 + 1/4 + 1/5.

(a) Write down the product
A_1 A_2 A_3 A_4 = \begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 5 \end{pmatrix}
associated to 1/2 + 1/3 + 1/4 + 1/5.

(b) Compute the four matrices A_1, A_1A_2, A_1A_2A_3, A_1A_2A_3A_4 and relate these to the
edges of the zigzag path in the strip of triangles for 1/2 + 1/3 + 1/4 + 1/5.

(c) Compute the four matrices A_4, A_3A_4, A_2A_3A_4, A_1A_2A_3A_4 and relate these to the
successive fractions that one gets when one computes the value of 1/2 + 1/3 + 1/4 + 1/5,
namely 1/5, 1/4 + 1/5, 1/3 + 1/4 + 1/5, and 1/2 + 1/3 + 1/4 + 1/5.


6. Compute the strip of triangles corresponding to the continued fraction for “/;9 and 
compare this with the sequence of matrices reducing F D ) to ts d by a sequence of 


operations subtracting one column from the other. (See the proof of Proposition 1.1.) 


7. Show that the continued fraction for a rational number is unique except for
replacing a final term 1/a_n by 1/(a_n - 1) + 1/1 when a_n > 1. For example
1/3 + 1/5 = 1/3 + 1/4 + 1/1.

2.2 Infinite Continued Fractions


We have seen that all rational numbers can be represented as continued fractions
a_0 + 1/a_1 + 1/a_2 + ··· + 1/a_n but what about irrational numbers? It turns out that
these can be represented as infinite continued fractions a_0 + 1/a_1 + 1/a_2 + 1/a_3 + ···.
A simple example is 1/1 + 1/1 + 1/1 + ···. The corresponding strip of triangles is
infinite:

[Figure: the infinite strip of triangles, with vertices 1/0, 1/1, 2/3, 5/8, 13/21, 34/55, ··· along the upper border and 0/1, 1/2, 3/5, 8/13, 21/34, 55/89, ··· along the lower border.]


Notice that these fractions after 1/0 are the successive ratios of the famous Fibonacci
sequence 0,1,1,2,3,5,8,13,21,··· where each number after the initial 0 and 1 is
the sum of its two predecessors. The sequence of convergents is thus 0/1, 1/1, 1/2,
2/3, 3/5, 5/8, 8/13, ···, the vertices along the zigzag path.


The way this zigzag path looks in the upper halfplane Farey diagram is shown in
the accompanying figure. After the initial vertical edge from 1/0 to 0/1, this path
consists of an infinite sequence of semicircles, each one shorter than the preceding one
and sharing a common endpoint. The left endpoints of the semicircles form an increasing
sequence of numbers which have to be approaching a certain limiting value x. We know x
has to be finite since it is certainly less than each of the right-hand endpoints of the
semicircles, the convergents 1/1, 2/3, 5/8, ···. Similarly, the right endpoints of the
semicircles form a decreasing sequence of numbers approaching a limiting value y
greater than each of the left-hand endpoints 0/1, 1/2, 3/5, ···. Obviously x ≤ y. Is it
possible that x is not equal to y? If this happened, the infinite sequence of semicircles
would be approaching the semicircle from x to y. Above this semicircle there would then
be an infinite number of semicircles, all the semicircles in the infinite sequence.
Between x and y there would have to be a rational number p/q (between any two real
numbers there is always a rational number), so above this rational number there would
be an infinite number of semicircles, hence an infinite number of triangles in the Farey
diagram. But we know that there are only finitely many triangles above any rational
number p/q, namely the triangles that appear in the strip for the continued fraction
for p/q. This contradiction shows that x has to be equal to y. Thus the sequence of
convergents along the edges of the infinite strip of triangles converges to a unique real
number x. (This is why the convergents are called convergents.)

[Figure: the zigzag path for 1/1 + 1/1 + 1/1 + ··· in the upper halfplane Farey diagram.]

This argument works for arbitrary infinite continued fractions, so we have shown 
the following general result: 


Proposition 2.2. For every infinite continued fraction a_0 + 1/a_1 + 1/a_2 + 1/a_3 + ···
the convergents converge to a unique limit.


This limit is by definition the value of the infinite continued fraction. There is a
simple method for computing the value in the example 1/1 + 1/1 + 1/1 + ··· involving
Fibonacci numbers. We begin by setting x = 1/1 + 1/1 + 1/1 + ···. Then if we take
the reciprocals of both sides of this equation we get 1/x = 1 + 1/1 + 1/1 + 1/1 + ···.
The right side of this equation is just 1 + x, so we can easily solve for x:

\[
\frac{1}{x} = 1 + x \quad\Longrightarrow\quad x^2 + x - 1 = 0 \quad\Longrightarrow\quad x = \frac{-1 \pm \sqrt{5}}{2}
\]

We know x is positive, so this rules out the negative root and we are left with the final
value x = (-1 + √5)/2. The reciprocal 1/x = 1 + x = (1 + √5)/2 = 1.618··· is known
as the golden ratio because of its many interesting and beautiful properties.
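The convergence to (-1 + √5)/2 can be observed directly. Here is a short Python sketch that generates the convergents as ratios of consecutive Fibonacci numbers and prints their distance from the limit; the function name `fibonacci_convergents` is our own, and floating point arithmetic is used only for display.

```python
import math

def fibonacci_convergents(count):
    """First `count` convergents of 1/1 + 1/1 + 1/1 + ..., as (numerator, denominator)."""
    result = []
    p, q = 0, 1                    # the convergent 0/1
    for _ in range(count):
        result.append((p, q))
        p, q = q, p + q            # next ratio of consecutive Fibonacci numbers
    return result

limit = (math.sqrt(5) - 1) / 2
for p, q in fibonacci_convergents(10):
    print(f"{p}/{q} = {p / q:.8f}   error = {abs(p / q - limit):.2e}")
```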


Proposition 2.3. Every irrational number has an expression as an infinite continued 
fraction, and this continued fraction is unique. 


Proof: In the upper halfplane Farey diagram consider the vertical line L going upward 
from a given irrational number x on the x-axis. The lower endpoint of L is not a 
vertex of the Farey diagram since x is irrational. Thus as we move downward along L 
we cross a sequence of triangles, entering each triangle by crossing its upper edge and 
leaving the triangle by crossing one of its two lower edges at a point between the two 
endpoints of this edge. When we exit one triangle, we are entering another triangle 
so the sequence of triangles and edges we cross must be infinite. The left and right 
endpoints of the edges in the sequence must be approaching the single point x by 
the argument we gave earlier, so the edges themselves are approaching x. It cannot 
happen that an infinite number of successive edges in the sequence have a common 
vertex since these edges would then be approaching this vertex, which would mean 
that x was rational. Thus the triangles crossed by the line L form an infinite strip 
consisting of an infinite sequence of fans with their pivot vertices on alternate sides 
of the strip. The zigzag path along this strip then gives a continued fraction for x. 
For the uniqueness, we have seen that an infinite continued fraction for x
corresponds to a zigzag path in the infinite strip of triangles lying above x. This set
of triangles is unique so the strip is unique, and there is only one path in this strip
that starts at 1/0 and then does left and right turns alternately, starting with a left
turn. The initial turn must be to the left because the first two convergents are a_0 and
a_0 + 1/a_1, with a_0 + 1/a_1 > a_0 since a_1 > 0. After the path traverses the initial edge
from 1/0 to a_0/1 no subsequent edge of the path can be in the border of the strip
since this would entail two successive left turns or two successive right turns. □


From the preceding arguments we can see fairly explicitly why the triangles in the 
upper halfplane Farey diagram completely cover the upper halfplane, so every point 
(x,y) with y > 0 lies either in the interior of some triangle or on the common edge 
between two triangles. To see this, consider the vertical line L in the upper halfplane 
through the given point (x, y). If x is an integer then (x, y) is on one of the vertical 
edges of the diagram, so we can assume x is not an integer and hence L is not one 
of the vertical edges of the diagram. The line L will then be contained in the strip of 
triangles corresponding to the continued fraction for x. This is a finite strip if x is 
rational and an infinite strip if x is irrational. In either case the point (x,y), being 
in L, will be in one of the triangles of the strip or on an edge separating two triangles 
in the strip. 


Let us consider now how the continued fraction for an irrational number can be
computed. Recall first how the continued fraction a_0 + 1/a_1 + 1/a_2 + ··· + 1/a_n for
a rational number is computed, as in the example of 67/24 = 2 + 1/1 + 1/3 + 1/1 + 1/4
earlier in the chapter. We first write 67/24 = 2 + 19/24 which gives a_0 = 2, then we
write 24/19 = 1 + 5/19 so a_1 = 1, then 19/5 = 3 + 4/5 so a_2 = 3, then 5/4 = 1 + 1/4
so a_3 = 1 and finally 4/1 = 4 + 0 so a_4 = 4. This finishes the process and we have
67/24 = a_0 + 1/a_1 + 1/a_2 + 1/a_3 + 1/a_4 = 2 + 1/1 + 1/3 + 1/1 + 1/4.

In summary, the steps are:

(1) Write the given number x as x = a_0 + r_1 where a_0 is an integer and 0 ≤ r_1 < 1.
(2) Write 1/r_1 as 1/r_1 = a_1 + r_2 where a_1 is an integer and 0 ≤ r_2 < 1.
(3) Write 1/r_2 as 1/r_2 = a_2 + r_3 where a_2 is an integer and 0 ≤ r_3 < 1.

And so on, repeatedly.
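For a rational number the steps above translate almost word for word into a short program. The following is a minimal Python sketch using exact fractions, assuming the input is rational so that the process stops; the function name `continued_fraction` is chosen here for illustration.

```python
from fractions import Fraction
import math

def continued_fraction(x):
    """Terms a_0, a_1, ..., a_n of the continued fraction of a rational number x."""
    x = Fraction(x)
    terms = []
    while True:
        a = math.floor(x)          # the integer part a_i
        terms.append(a)
        r = x - a                  # the remainder, with 0 <= r < 1
        if r == 0:
            return terms           # a zero remainder ends the process
        x = 1 / r                  # invert the remainder and repeat

print(continued_fraction(Fraction(67, 24)))
```

Applied to 67/24 this returns the terms 2, 1, 3, 1, 4 found in the worked example.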

If x is a rational number, the "remainders" r_i are rational numbers with decreasing
denominators until we reach a remainder r_n which is zero and the process stops
after finitely many steps. We can apply the same procedure if x is irrational, but in
this case the equations defining the remainders r_i show that each successive r_i must
be irrational and in particular nonzero. Thus the process goes on forever, yielding an
infinite continued fraction.

One can see this is the continued fraction for x by the following argument.
Suppose the continued fraction for x is a_0 + 1/a_1 + 1/a_2 + ···. We can write this continued
fraction as a_0 + r_1 for r_1 = 1/a_1 + 1/a_2 + ···. This r_1 is a number strictly between
0 and 1 since the convergents for r_1 all lie between 0 and 1 and r_1 lies between
any two of its successive convergents. Thus we have x = a_0 + r_1 with 0 < r_1 < 1
so a_0 is the largest integer less than x. Inverting r_1 = 1/a_1 + 1/a_2 + ··· gives
1/r_1 = a_1 + 1/a_2 + 1/a_3 + ···. The preceding argument can now be repeated with 1/r_1
in place of x to get 1/r_1 = a_1 + r_2 with r_2 = 1/a_2 + 1/a_3 + ··· and 0 < r_2 < 1. Then
one repeats with 1/r_2 in place of 1/r_1, and so on.

However, there are a couple of subtle points in this argument that are somewhat
hidden by the notation. (These subtle points were also lurking in the background in
the earlier calculation of the value of 1/1 + 1/1 + 1/1 + ···.) First, we defined x and r_1
to be the infinite continued fractions a_0 + 1/a_1 + 1/a_2 + ··· and 1/a_1 + 1/a_2 + ··· and
then said that x = a_0 + r_1. For finite continued fractions this is true because they are
evaluated from right to left, so the last step in evaluating a_0 + 1/a_1 + 1/a_2 + ··· + 1/a_n
is to add a_0 to 1/a_1 + 1/a_2 + ··· + 1/a_n. Infinite continued fractions cannot be
evaluated from right to left since there is no right end to start the evaluation. Instead
they are evaluated from left to right as the limit of the sequence of convergents. The
convergents are the values of finite continued fractions, and for these the desired
result holds so the convergents for a_0 + 1/a_1 + 1/a_2 + ··· are obtained by adding a_0
to the convergents for 1/a_1 + 1/a_2 + ···. Adding a fixed number a_0 to each term of
a convergent sequence of numbers adds a_0 to the limit of the sequence, so the result
holds for infinite continued fractions as well as finite continued fractions.

A similar issue arises when we said that the continued fraction for the reciprocal
1/r_1 of r_1 = 1/a_1 + 1/a_2 + ··· is a_1 + 1/a_2 + 1/a_3 + ···. Again this is correct for finite
continued fractions since they are evaluated from right to left, so if one stops the evaluation
of 1/a_1 + 1/a_2 + ··· + 1/a_n before the last step of inverting a_1 + 1/a_2 + ··· + 1/a_n
one has the reciprocal of 1/a_1 + 1/a_2 + ··· + 1/a_n. Thus the convergents for the
infinite continued fraction 1/a_1 + 1/a_2 + ··· are the reciprocals of the convergents
for a_1 + 1/a_2 + ··· so the limits of the convergents for the two infinite continued
fractions will also be reciprocals of each other.

Here is how the procedure works for computing the continued fraction for √2:

(1) √2 = 1 + (√2 - 1) where a_0 = 1 since √2 is between 1 and 2. Thus r_1 = √2 - 1.
(2) 1/r_1 = 1/(√2 - 1) = (1/(√2 - 1)) · ((√2 + 1)/(√2 + 1)) = √2 + 1 which is between 2 and 3
    so we have a_1 = 2 and √2 + 1 = 2 + (√2 - 1) with r_2 = √2 - 1.

Notice that something unexpected has happened: The remainder r_2 = √2 - 1 is exactly
the same as the previous remainder r_1. There is then no need to do the calculation
of 1/r_2 since we know it will have to be √2 + 1. This means that when we continue
with step (3), this will be exactly the same as step (2), and the same will be true for
all subsequent steps. Thus we can immediately write down the continued fraction
for √2:

\[
\sqrt{2} = 1 + 1/2 + 1/2 + 1/2 + \cdots
\]

We can check this calculation by finding the value of the continued fraction in the
same way that we did earlier for 1/1 + 1/1 + 1/1 + ···. It suffices to compute the value
of 1/2 + 1/2 + 1/2 + ··· and then add 1. We set x = 1/2 + 1/2 + 1/2 + ··· and then
take reciprocals to get 1/x = 2 + 1/2 + 1/2 + 1/2 + ··· = 2 + x. From 1/x = 2 + x we get
the quadratic equation x² + 2x - 1 = 0 with roots x = -1 ± √2. Since x is positive
we can discard the negative root. Thus we have -1 + √2 = 1/2 + 1/2 + 1/2 + ···.
Adding 1 to both sides of this equation gives the continued fraction for √2.

We can get good rational approximations to √2 by computing the convergents in
its continued fraction 1 + 1/2 + 1/2 + 1/2 + ···. It is a little easier to compute the
convergents in 2 + 1/2 + 1/2 + 1/2 + ··· = 1 + √2 and then subtract 1 from each of
these. For 2 + 1/2 + 1/2 + 1/2 + ··· the convergents are:

    2/1, 5/2, 12/5, 29/12, 70/29, 169/70, 408/169, 985/408, ···

Notice that the sequence of numbers 1,2,5,12,29,70,169,··· is constructed in a
way somewhat analogous to the Fibonacci sequence, except that each number is twice
the preceding number plus the number before that. (It is easy to see why this has to
be true, because each convergent is constructed from the previous one by inverting the
fraction and adding 2.) After subtracting 1 from each of these fractions we get the
convergents to √2, shown in the following table:

         √2 = 1.41421356···
        1/1 = 1.00000000···
        3/2 = 1.50000000···
        7/5 = 1.40000000···
      17/12 = 1.41666666···
      41/29 = 1.41379310···
      99/70 = 1.41428571···
    239/169 = 1.41420118···
    577/408 = 1.41421568···

Notice that once an initial string of digits occurs twice in succession, then this
string is unchanged from then on. This is because for any two successive convergents,
all subsequent convergents lie between these two since the convergents occur along a
zigzag path in the Farey diagram. This is true generally for all infinite continued
fractions.
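The rule "invert the fraction and add 2" gives a quick way to generate these convergents by machine. Below is a minimal Python sketch using exact fractions; the function name `sqrt2_convergents` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

def sqrt2_convergents(count):
    """First `count` convergents of sqrt(2) = 1 + 1/2 + 1/2 + 1/2 + ..."""
    result = []
    y = Fraction(2)                # first convergent of 2 + 1/2 + 1/2 + ... = 1 + sqrt(2)
    for _ in range(count):
        result.append(y - 1)       # subtract 1 to get a convergent of sqrt(2)
        y = 2 + 1 / y              # invert the fraction and add 2
    return result

for c in sqrt2_convergents(8):
    print(f"{c} = {float(c):.8f}")
```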
We can compute the continued fraction for √3 by the same method as for √2,
but something slightly different happens:

(1) √3 = 1 + (√3 - 1) with a_0 = 1 since √3 is between 1 and 2. Thus r_1 = √3 - 1.
(2) 1/r_1 = 1/(√3 - 1) = (1/(√3 - 1)) · ((√3 + 1)/(√3 + 1)) = (√3 + 1)/2. This is between
    1 and 2 since its numerator √3 + 1 is between 2 and 3. Thus a_1 = 1 and
    (√3 + 1)/2 = 1 + (√3 - 1)/2 with r_2 = (√3 - 1)/2.
(3) 1/r_2 = 2/(√3 - 1) = (2/(√3 - 1)) · ((√3 + 1)/(√3 + 1)) = √3 + 1 = 2 + (√3 - 1) with
    a_2 = 2 and r_3 = √3 - 1.

Now the remainder r_3 = √3 - 1 is the same as r_1 so instead of the same step being
repeated infinitely often as happened for √2, the same two steps will repeat infinitely
often. Thus we have computed the continued fraction for √3:

\[
\sqrt{3} = 1 + 1/1 + 1/2 + 1/1 + 1/2 + 1/1 + 1/2 + \cdots
\]

Checking this takes a little more work than before. We begin by isolating the part of
the continued fraction that repeats periodically:

\[
x = 1/1 + 1/2 + 1/1 + 1/2 + 1/1 + 1/2 + \cdots
\]

Taking reciprocals, we get:

\[
\frac{1}{x} = 1 + 1/2 + 1/1 + 1/2 + 1/1 + 1/2 + \cdots
\]

Subtracting 1 from both sides gives:

\[
\frac{1}{x} - 1 = 1/2 + 1/1 + 1/2 + 1/1 + 1/2 + \cdots
\]

The next step will be to take reciprocals of both sides, so before doing this we rewrite
the left side as (1 - x)/x. Then taking reciprocals gives:

\[
\frac{x}{1 - x} = 2 + 1/1 + 1/2 + 1/1 + 1/2 + \cdots = 2 + x
\]

Thus we have x/(1 - x) = 2 + x which simplifies to the quadratic equation x² + 2x - 2 = 0
with roots x = -1 ± √3. Again the negative root is discarded and we get x = -1 + √3,
so √3 = 1 + x = 1 + 1/1 + 1/2 + 1/1 + 1/2 + 1/1 + 1/2 + ··· which agrees with the
answer we obtained previously.


To simplify the notation we will write a bar over a block of terms in a continued
fraction that repeats infinitely often, for example:

\[
\sqrt{2} = 1 + \overline{1/2} \qquad\text{and}\qquad \sqrt{3} = 1 + \overline{1/1 + 1/2}
\]

It is true in general that for every positive integer n that is not a square, the
continued fraction for √n has the form a_0 + \overline{1/a_1 + 1/a_2 + ··· + 1/a_k}. The length
of the period (the repeating block) can be large even for fairly small values of n, for
example:

\[
\sqrt{46} = 6 + \overline{1/1 + 1/3 + 1/1 + 1/1 + 1/2 + 1/6 + 1/2 + 1/1 + 1/1 + 1/3 + 1/1 + 1/12}
\]

This example illustrates two other curious facts about the continued fraction for an
irrational number √n:

• The last term of the period (12 in the example above) is always twice the first
  term a_0 (the initial 6).

• If the last term of the period is omitted, the preceding terms in the period form
  a palindrome, reading the same backwards as forwards.
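These two properties can be tested for many values of n at once. The Python sketch below does not use the remainder procedure described earlier; as an assumption on our part it uses a standard integer-only recurrence that tracks each stage as a number of the form (√n + m)/d, stopping when the pair (m, d) first repeats. The function names `sqrt_continued_fraction` and `has_expected_symmetry` are our own.

```python
import math

def sqrt_continued_fraction(n):
    """Return (a_0, period) for sqrt(n), where n is a positive non-square integer."""
    a0 = math.isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    # After removing a_0, the reciprocal of the remainder is (sqrt(n) + m) / d.
    m, d = a0, n - a0 * a0
    first = (m, d)
    period = []
    while True:
        a = (a0 + m) // d          # integer part of (sqrt(n) + m) / d
        period.append(a)
        m = d * a - m
        d = (n - m * m) // d
        if (m, d) == first:        # the state has repeated, so the period is complete
            return a0, period

def has_expected_symmetry(n):
    """Check that the period ends in 2*a_0 and is a palindrome without that last term."""
    a0, period = sqrt_continued_fraction(n)
    body = period[:-1]
    return period[-1] == 2 * a0 and body == body[::-1]

print(sqrt_continued_fraction(46))
```

For n = 46 this returns the initial term 6 together with the period 1, 3, 1, 1, 2, 6, 2, 1, 1, 3, 1, 12 displayed above.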


We will see in Section 4.2 how these two properties follow from certain symmetry
properties of the infinite strip of triangles in the Farey diagram associated to the
continued fraction for √n.

It is natural to ask exactly which irrational numbers have continued fractions that
are periodic or at least eventually periodic, like for example:

LEGA gt 75 LT = GER LEP BOG Tt 738 76 Te 
The answer is given by: 


Lagrange's Theorem. The irrational numbers whose continued fractions are eventually
periodic are exactly the numbers of the form a + b√n where a and b are
rational numbers, b ≠ 0, and n is a positive integer that is not a square.


These numbers a + b√n are called quadratic irrationals because they are roots
of quadratic equations with integer coefficients. The easier half of the theorem is the 
statement that the value of an eventually periodic infinite continued fraction is always 
a quadratic irrational. This can be proved by showing that the method we used for 
finding a quadratic equation satisfied by an eventually periodic continued fraction 
works in general. Rather than following this purely algebraic approach, however, we 
will develop a more geometric version of the procedure in the next chapter, so we 
will wait until then to give the argument that proves this half of Lagrange’s Theorem, 
in Proposition 3.4. The more difficult half of the theorem is the assertion that the 
continued fraction expansion of every quadratic irrational is eventually periodic. It 
is not at all apparent from the examples of √2 and √3 why this should be true in
general, but in Chapters 4 and 5 we will develop some theory that will make it clear, 
with the actual proof being given in Proposition 4.1 and Theorem 5.2. Along the way 
we will also develop more efficient methods for computing the continued fraction for 
a quadratic irrational and for computing the value of an eventually periodic infinite 
continued fraction. 

What can be said about the continued fraction expansions of irrational numbers
that are not quadratic, such as ∛2, π, or e, the base for natural logarithms? It
happens that e has a continued fraction whose terms have a very nice pattern, even
though they are not periodic or eventually periodic:

\[
e = 2 + 1/1 + 1/2 + 1/1 + 1/1 + 1/4 + 1/1 + 1/1 + 1/6 + 1/1 + \cdots
\]

Thus the terms are grouped by threes with successive even numbers as middle
denominators. Even simpler are the continued fractions for certain numbers built from
e that have arithmetic progressions for their denominators:

\[
\frac{e - 1}{e + 1} = 1/2 + 1/6 + 1/10 + 1/14 + \cdots
\qquad
\frac{e^2 - 1}{e^2 + 1} = 1/1 + 1/3 + 1/5 + 1/7 + \cdots
\]
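These patterns are easy to test numerically. The following Python sketch builds the terms of the continued fraction for e from the pattern just described and evaluates finite truncations exactly; the helper names `e_terms` and `evaluate` are hypothetical, and the comparison with the floating point value of e is only approximate.

```python
import math
from fractions import Fraction

def e_terms(count):
    """First `count` terms after the initial 2: 1, 2, 1, 1, 4, 1, 1, 6, 1, ..."""
    return [2 * (i // 3 + 1) if i % 3 == 1 else 1 for i in range(count)]

def evaluate(a0, terms):
    """Value of a_0 + 1/a_1 + ... + 1/a_n, computed from right to left."""
    value = Fraction(0)
    for a in reversed(terms):
        value = 1 / (a + value)
    return a0 + value

print(e_terms(9))
print(float(evaluate(2, e_terms(20))), math.e)
print(float(evaluate(0, [2, 6, 10, 14, 18])), (math.e - 1) / (math.e + 1))
```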


The continued fractions for e and (e - 1)/(e + 1) were discovered by Euler in 1737
while the formula for (e² - 1)/(e² + 1) was found by Lambert in 1766 as a special
case of a slightly more complicated formula for (e^x - 1)/(e^x + 1).

For ∛2 and π, however, the continued fractions have no known pattern. For π
the continued fraction begins:

\[
\pi = 3 + 1/7 + 1/15 + 1/1 + 1/292 + \cdots
\]

Here the first four convergents are 3, 22/7, 333/106, and 355/113. We recognize 22/7 as
the familiar approximation 3 1/7 to π. The convergent 355/113 is a particularly good
approximation to π since its decimal expansion begins 3.14159292 whereas π =
3.14159265···. It is no accident that the convergent 355/113 obtained by truncating
the continued fraction just before the 292 term gives a good approximation to π
since it is a general fact that a convergent immediately preceding a large term in the
continued fraction always gives an especially good approximation. This is because the
next edge in the zigzag path will be rather small when viewed in the upper halfplane
Farey diagram since it is the lower edge of a fan with a large number of triangles, and
the value of the continued fraction lies somewhere between the two ends of this small
edge.
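The effect of the large term 292 shows up clearly if we list the errors of the first few convergents. Here is a minimal Python sketch using the recurrence for convergents from Section 2.1; the function name `convergents` is our own, and the errors are computed in floating point for display only.

```python
import math
from fractions import Fraction

def convergents(terms):
    """Convergents of a_0 + 1/a_1 + ... + 1/a_n as exact fractions."""
    p_prev, q_prev = 1, 0          # the vertex 1/0
    p, q = terms[0], 1             # the vertex a_0/1
    result = [Fraction(p, q)]
    for a in terms[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

# The terms 3, 7, 15, 1, 292 at the start of the continued fraction for pi.
for c in convergents([3, 7, 15, 1, 292]):
    print(f"{c}   error = {abs(float(c) - math.pi):.2e}")
```

The error drops sharply at 355/113, the convergent just before the term 292.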

There are nice continued fractions for π if one allows numerators larger than 1,
as in the following formula discovered by Euler:


= 1? 3? 5? 7 
WS a a a re 
However, it is the continued fractions with numerator 1 that have the best properties, 
so we will not consider the more general sort in this book. 


Exercises 


1. Compute the values of the following infinite continued fractions: 
(a) 74 

(b) ie for an arbitrary positive integer n 

O FA: and. Tick yo 73 

Yt Hoty ee and a A EAA Ee 
(3 Ys 

2. (a) Compute the continued fractions for √5 and √23.

(b) Using the continued fraction for √5, find the first convergent which gives a rational
approximation to √5 accurate to four decimal places.


3. Compute the continued fractions for √(n² + 1) and √(n² + n) where n is an arbitrary
positive integer.

2.3 Linear Diophantine Equations


As an application of continued fractions let us see how they can be used to solve 
linear Diophantine equations ax + by = n, where a, b, and n are integers and the 
solutions are required to be integers as well. We can assume neither a nor b is zero, 
otherwise the equation is rather trivial. Changing the signs of x or y if necessary, we 
can rewrite the equation in the form ax — by = n where a and b are both positive. 
Solving this equation means finding multiples of a and b that differ by n. 

If a and b have greatest common divisor d > 1, then since d divides a and b 
it must divide ax — by, so d must divide n if the equation ax — by = n is to have 
any solutions at all. If d does divide n we can divide both sides of the equation by 
d to get a new equation having the same solutions but with the new coefficients a 
and b coprime. For example, the equation 6x — 15y = 21 reduces in this way to the 
equation 2x — 5y = 7. Thus we can assume from now on that a and b are coprime. 
We will show that solutions always exist in this case, in fact infinitely many solutions, 
and we will see how to compute them. 

To find a solution of ax - by = n it suffices to do the case n = 1 since if we
have a solution of ax - by = 1, we can multiply x and y by n to get a solution of
ax - by = n. For example, for the equation 2x - 5y = 1 the smallest multiple of 2
that is one greater than a multiple of 5 is 2·3 > 5·1, so a solution of 2x - 5y = 1 is
(x,y) = (3,1). A solution of 2x - 5y = 7 is then (x,y) = (21,7).
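The two reductions described so far, dividing out the greatest common divisor and scaling a solution of the case n = 1, can be summarized in a few lines of Python. This is only a sketch under the stated assumptions; the function names `reduce_equation` and `scale_solution` are hypothetical.

```python
import math

def reduce_equation(a, b, n):
    """Divide ax - by = n by d = gcd(a, b); return None if d does not divide n."""
    d = math.gcd(a, b)
    if n % d != 0:
        return None                # no integer solutions exist in this case
    return a // d, b // d, n // d

def scale_solution(solution, n):
    """Turn a solution of ax - by = 1 into a solution of ax - by = n."""
    x, y = solution
    return n * x, n * y

print(reduce_equation(6, 15, 21))  # the example 6x - 15y = 21 from the text
print(scale_solution((3, 1), 7))   # from 2x - 5y = 1 to 2x - 5y = 7
```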

The idea for solving ax - by = 1 when a and b are coprime is to utilize the
criterion from Proposition 1.1 that the Farey diagram contains an edge joining a/b
to c/d exactly when ad - bc = ±1. In the case that ad - bc = +1 a solution of
ax - by = 1 is then (x,y) = (d,c), and when ad - bc = -1 a solution of ax - by = 1
is (x,y) = (-d,-c).

For a given coprime pair of positive integers a and b we can compute the
continued fraction for a/b and the corresponding strip of triangles in the Farey diagram
from 1/0 to a/b. The last edge in the zigzag path in this strip connects a fraction
c/d to a/b, so we have ad - bc = ±1. Since c/d is the next to last vertex along the
zigzag path, the continued fraction for c/d is obtained from the continued fraction
for a/b by omitting the last term. From this truncated continued fraction we can then
compute c/d and hence a solution of ax - by = 1.

As an example, let us solve 67x - 24y = 1. The continued fraction for 67/24 is
2 + 1/1 + 1/3 + 1/1 + 1/4. Omitting the last term gives 2 + 1/1 + 1/3 + 1/1 which
equals 14/5. Thus we have 67·5 - 24·14 = ±1. The sign can be determined by
observing that 67/24 lies to the right of 14/5 in the circular Farey diagram so
67/24 < 14/5, hence 67·5 < 24·14 and therefore 67·5 - 24·14 = -1. Thus we obtain
the solution (x,y) = (-5,-14).

[Figure: the strip of triangles for 67/24, ending with the edge from 14/5 to 67/24.]

The fact that 67/24 lies to the right of 14/5 in the Farey diagram is a consequence
of the strip of triangles having an even number of fans. With an odd number of fans
the situation would be reversed. The number of fans is the number of terms in the
continued fraction after the initial integer, so we see that it is not really necessary to
draw the strip of triangles to figure out the correct sign.
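
The procedure just described can be sketched in Python. The function names continued_fraction, convergent, and solve_ax_minus_by_eq_1 are hypothetical names chosen for this illustration, and the sketch assumes that a and b are coprime positive integers with b > 1, so that the continued fraction has at least one term after the initial integer. Instead of counting fans, it simply evaluates ad − bc to decide the sign.

```python
from fractions import Fraction

def continued_fraction(a, b):
    """Terms [a0, a1, ..., an] of the continued fraction for a/b (a, b > 0)."""
    terms = []
    while b != 0:
        q, r = divmod(a, b)
        terms.append(q)
        a, b = b, r
    return terms

def convergent(terms):
    """Evaluate a0 + 1/(a1 + 1/(a2 + ...)) as an exact fraction."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value

def solve_ax_minus_by_eq_1(a, b):
    """One solution (x, y) of a*x - b*y = 1 for coprime a, b with b > 1."""
    terms = continued_fraction(a, b)
    cd = convergent(terms[:-1])          # c/d, the next-to-last vertex
    c, d = cd.numerator, cd.denominator
    sign = a * d - b * c                 # equals +1 or -1
    return (d, c) if sign == 1 else (-d, -c)

print(continued_fraction(67, 24))        # [2, 1, 3, 1, 4]
print(solve_ax_minus_by_eq_1(67, 24))    # (-5, -14)
```

Exact arithmetic with Fraction avoids any rounding when the truncated continued fraction is evaluated.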

Another way to determine the sign without using the diagram is by computing
67·5 − 24·14 mod 10 to see whether we get +1 or −1 mod 10. Computing mod 10
means ignoring all but the last digit, so we get 7·5 − 4·4 = 19 ≡ −1 mod 10 and
hence the sign is negative.

We can get other solutions to 67x − 24y = 1 by using other edges of the Farey
diagram with endpoint 67/24 instead of the edge from 14/5. For example we could
use the edge to 67/24 in the lower border of the strip of triangles. By the mediant
rule this edge joins 53/19 to 67/24, so we have 67·19 − 24·53 = ±1 and this time
the plus sign is correct, giving the solution (x,y) = (19,53). All the other edges
connected to 67/24 are obtained by repeatedly “adding” 67/24 either to 14/5 for edges
above 67/24, or to 53/19 for edges below 67/24. In the former case these are the edges
leading to the fractions (14 + 67k)/(5 + 24k) for positive integers k, and in the latter case
they are the edges to (53 + 67k)/(19 + 24k) for positive integers k. Notice that if we let k be
negative in one of these formulas, we get the fractions given by the other formula. For
example in (53 + 67k)/(19 + 24k) the values k = −1, −2, ··· give the fractions −14/−5 = 14/5,
−81/−29 = 81/29, ··· which are the values of (14 + 67k)/(5 + 24k) for k = 0, 1, ···. This
means that the general solution of 67x − 24y = 1 is (x,y) = (19 + 24k, 53 + 67k)
for arbitrary integers k. Alternatively, we could write the general solution as (x,y) =
(−5 − 24k, −14 − 67k) or as (x,y) = (−5 + 24k, −14 + 67k) since k can be replaced
by −k.

This example illustrates a general fact: 


Proposition 2.4. For coprime integers a and b, if one solution of ax − by = n
is (x,y) = (p,q) then the general solution is (x,y) = (p + bk, q + ak) for k an
arbitrary integer.


Here we do not need to assume a and b are positive, so by changing the sign of
b we can write the equation as ax + by = n with general solution (p − bk, q + ak),
or alternatively as (p + bk, q − ak).
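
As a quick numerical check of Proposition 2.4, the following sketch compares the family (p + bk, q + ak) with a brute-force search for the equation 67x − 24y = 1. The helper name general_solution and the search bounds are arbitrary choices made for this illustration.

```python
def general_solution(a, b, p, q, ks):
    """Solutions (p + b*k, q + a*k) of a*x - b*y = n built from one solution (p, q)."""
    return [(p + b * k, q + a * k) for k in ks]

a, b, n = 67, 24, 1
family = general_solution(a, b, 19, 53, range(-3, 4))
# Every member of the family solves the equation.
assert all(a * x - b * y == n for x, y in family)

# Brute force: every solution with |x| <= 60 should lie in the family.
found = {(x, (a * x - n) // b) for x in range(-60, 61) if (a * x - n) % b == 0}
assert found <= set(general_solution(a, b, 19, 53, range(-10, 10)))
print(sorted(found))
```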


Proof: One solution (x,y) = (p,q) of ax − by = n is given. For an arbitrary solution
(x,y) we look at the difference (x − p, y − q) which we denote as (x_0, y_0). This
satisfies ax_0 − by_0 = 0, or in other words, ax_0 = by_0. Since a and b are coprime,
the equation ax_0 = by_0 must have the form a(bk) = b(ak) for some integer k,
with x_0 = bk and y_0 = ak. Hence every solution of ax − by = n has the form
(x,y) = (p + x_0, q + y_0) = (p + bk, q + ak). It is easy to check that these formulas
for x and y give solutions to ax − by = n for all values of k. □


The Diophantine equation ax − by = n can be interpreted as a congruence condition
by rewriting it as ax − n = by which implies that ax ≡ n mod b. Conversely, if
ax ≡ n mod b then this means that ax − n = by for some integer y, so ax − by = n.
Thus a solution (x,y) of ax − by = n gives a solution x of ax ≡ n mod b, and
a solution x of ax ≡ n mod b gives a solution (x,y) of ax − by = n since this
equation allows y to be computed from a, b, n, and x if b is nonzero.

The special case ax − by = 1 is equivalent to ax ≡ 1 mod b which says that x is
a multiplicative inverse to a mod b. We know that ax − by = 1 has a solution exactly
when a and b are coprime, so this means that a has a multiplicative inverse mod b
exactly when a is coprime to b. For example the congruence classes mod 15 that
are coprime to 15 are 1, 2, 4, 7, 8, 11, 13, 14 and we can find multiplicative inverses
for each of these by observing that the products 1·1, 2·8, 4·4, 7·13, 11·11, and
14·14 are each congruent to 1 mod 15. Thus the numbers 1, 4, 11, and 14 are their
own inverses mod 15 while the other inverses occur in pairs, the pair 2, 8 and the
pair 7, 13. We could shorten these calculations by noting that if ax ≡ 1 mod b then
(−a)(−x) ≡ 1 mod b, so for example 2·8 ≡ 1 mod 15 implies (−2)(−8) ≡ 1 mod 15
or in other words 13·7 ≡ 1 mod 15. Similarly 4·4 ≡ 1 mod 15 implies 11·11 ≡ 1
mod 15.
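
The table of inverses mod 15 can be reproduced with a few lines of Python. This is only a sketch: the function name inverses_mod is a hypothetical name, and the three-argument pow with exponent −1 assumes Python 3.8 or later.

```python
from math import gcd

def inverses_mod(n):
    """Map each class a coprime to n (1 <= a < n) to its inverse mod n."""
    return {a: pow(a, -1, n) for a in range(1, n) if gcd(a, n) == 1}

inv = inverses_mod(15)
print(inv)
# {1: 1, 2: 8, 4: 4, 7: 13, 8: 2, 11: 11, 13: 7, 14: 14}
```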

The function which assigns to each positive integer n the number of congruence
classes mod n of numbers coprime to n is called the Euler phi function φ(n). Thus
in the preceding example of multiplicative inverses mod 15 we have φ(15) = 8 from
the eight numbers 1, 2, 4, 7, 8, 11, 13, 14. Later in this section we will obtain a formula
for φ(n).


Linear Diophantine equations with more than two variables can be solved by reduction
to the case of two variables. Consider for example a three-variable equation
ax + by + cz = n. Any number that divides all three coefficients a, b, c must also
divide n if a solution is to exist, and in this case we can simplify the equation by
dividing it by the greatest common divisor of a, b, and c, so we may as well assume
that the greatest common divisor of a, b, and c is 1.

As an example that is typical of the general case for three variables, consider the 
equation 6x + 10y + 15z = 7. Here the greatest common divisor of 6, 10, and 15 
is 1, although when taken two at a time they have larger common divisors: 2 for 6 
and 10, 3 for 6 and 15, and 5 for 10 and 15. 

The idea for solving 6x + 10y + 15z = 7 is to write it first as 2(3x + 5y) + 15z = 7
and then to rewrite this as the two equations 3x + 5y = w and 2w + 15z = 7. The
first equation 3x + 5y = w has solutions for every w since 3 and 5 are coprime,
and we can find the solutions by first solving 3x + 5y = 1 and then multiplying these
solutions by w. Since the coefficients 3 and 5 are so small, we can find a solution
of 3x + 5y = 1 by inspection rather than computing continued fractions, and we
see that (x,y) = (2,−1) is a solution. Then (x,y) = (2w,−w) is a solution of
3x + 5y = w. Applying Proposition 2.4, the general solution of 3x + 5y = w can
therefore be written as (x,y) = (2w + 5s, −w − 3s) for s an arbitrary integer.

Next we solve 2w + 15z = 7. A solution of 2w + 15z = 1 is (w,z) = (8,−1) so a
solution of 2w + 15z = 7 is (w,z) = (56,−7). The general solution of 2w + 15z = 7
is then (w,z) = (56 + 15t, −7 − 2t) for arbitrary integers t. Alternatively, we could
notice that 2w + 15z = 7 has the simpler solution (w,z) = (−4,1), obtained either
by inspection or by letting t = −4 in the pair (56 + 15t, −7 − 2t). Hence the general
solution of 2w + 15z = 7 can also be written as (w,z) = (−4 + 15t, 1 − 2t).

Using (w,z) = (−4 + 15t, 1 − 2t) we now substitute w = −4 + 15t into the
earlier formula (x,y) = (2w + 5s, −w − 3s) to obtain the final answer in terms of the
arbitrary integers s and t:

    (x,y,z) = (2(−4 + 15t) + 5s, −(−4 + 15t) − 3s, 1 − 2t)
            = (−8 + 5s + 30t, 4 − 3s − 15t, 1 − 2t)

In the spirit of Proposition 2.4 we can say that a particular solution of 6x + 10y + 15z =
7 is (−8,4,1), obtained by setting s = t = 0, and the general solution is obtained
by adding this particular solution to (5s + 30t, −3s − 15t, −2t) which is the general
solution of the associated equation 6x + 10y + 15z = 0 with right side zero.
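
The two-parameter answer can be checked by machine. The sketch below, with hypothetical helper names solution and parameters and an arbitrarily chosen search box, confirms that every member of the family solves 6x + 10y + 15z = 7 and that every solution found by brute force in the box comes from some pair (s, t).

```python
def solution(s, t):
    """The parametrized solution of 6x + 10y + 15z = 7 derived in the text."""
    return (-8 + 5 * s + 30 * t, 4 - 3 * s - 15 * t, 1 - 2 * t)

# Every member of the family is a solution.
assert all(6 * x + 10 * y + 15 * z == 7
           for s in range(-5, 6) for t in range(-5, 6)
           for x, y, z in [solution(s, t)])

def parameters(x, y, z):
    """Recover (s, t) from a solution, or None if it is not in the family."""
    if (1 - z) % 2:
        return None
    t = (1 - z) // 2
    if (x + 8 - 30 * t) % 5:
        return None
    s = (x + 8 - 30 * t) // 5
    return (s, t) if solution(s, t) == (x, y, z) else None

# Conversely, every solution in a small box arises from some (s, t).
box = range(-12, 13)
sols = [(x, y, z) for x in box for y in box for z in box
        if 6 * x + 10 * y + 15 * z == 7]
assert all(parameters(*p) is not None for p in sols)
print(len(sols), "solutions in the box, all of the stated form")
```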


The situation for equations with more variables is similar to what happened in 
this example, with an equation in n variables breaking up into n — 1 equations in 
two variables. Each of these has solutions depending on an integer parameter, so the 
solutions of the n-variable equation depend on n — 1 independent parameters. 

We can apply what we have learned about linear Diophantine equations to derive a 
general fact about congruences often referred to as the Chinese Remainder Theorem 
since it was used in ancient Chinese manuscripts to solve mathematical puzzles of a 
certain type. 


Proposition 2.5. A collection of congruence conditions

    x ≡ a_1 mod m_1
    x ≡ a_2 mod m_2
    ···
    x ≡ a_k mod m_k

always has a simultaneous solution provided that no two of the moduli m_i have a
common divisor greater than 1, and in this case the collection of all solutions forms
a single congruence class modulo the product m_1 ··· m_k.


Without the hypothesis that the various moduli m_i are coprime there may not
be a common solution. For example the two congruences x ≡ 5 mod 6 and x ≡ 7
mod 15 have no common solution since the first congruence implies x ≡ 2 mod 3
while the second congruence implies x ≡ 1 mod 3. Here we are using the following
general fact about congruences that will be used often:


If a congruence a ≡ b holds mod n then it holds mod d for each divisor d of n.


This is true because if n divides a − b then so does d for each divisor d of n.


Proof of Proposition 2.5: Let us first prove the existence of a common solution x
when there are just two congruences x ≡ a_1 mod m_1 and x ≡ a_2 mod m_2. In
this case the desired number x will have the form x = a_1 + x_1 m_1 = a_2 + x_2 m_2 for
some pair of yet-to-be-determined numbers x_1 and x_2. We can rewrite the equation
a_1 + x_1 m_1 = a_2 + x_2 m_2 as m_1 x_1 − m_2 x_2 = a_2 − a_1. We know that this equation has
a solution (x_1, x_2) with integers x_1 and x_2 whenever m_1 and m_2 are coprime. This
is obtained by first finding integers n_1 and n_2 such that m_1 n_1 + m_2 n_2 = 1 and then
multiplying this equation by a_2 − a_1 to get (a_2 − a_1)m_1 n_1 + (a_2 − a_1)m_2 n_2 = a_2 − a_1.
Then in the equation m_1 x_1 − m_2 x_2 = a_2 − a_1 we may choose x_1 = (a_2 − a_1)n_1 and
x_2 = (a_2 − a_1)(−n_2). Thus we have:

    x = a_1 + x_1 m_1
      = a_1 + m_1 (a_2 − a_1) n_1
      = a_1 (1 − m_1 n_1) + a_2 m_1 n_1
      = a_1 m_2 n_2 + a_2 m_1 n_1      since 1 − m_1 n_1 = m_2 n_2

Summarizing, we have the solution x = a_1 m_2 n_2 + a_2 m_1 n_1 where n_1 and n_2 satisfy
m_1 n_1 + m_2 n_2 = 1.

For a system of more than two congruences we may suppose by induction on
the number of congruences that we have a number x = a satisfying all but the last
congruence x ≡ a_k mod m_k. From the preceding paragraph we know that a number x
exists satisfying the two congruences x ≡ a mod m_1 ··· m_{k−1} and x ≡ a_k mod m_k
since m_1 ··· m_{k−1} and m_k are coprime. This gives the desired solution to all k
congruences x ≡ a_i mod m_i since x ≡ a mod m_1 ··· m_{k−1} implies x ≡ a mod m_i
for each i < k, and a ≡ a_i mod m_i for each i < k by the inductive hypothesis.

Now we show that all the different solutions of the given set of congruences form
a single congruence class mod m_1 ··· m_k. If x and y are two solutions then the
difference x − y is congruent to 0 mod each of the numbers m_1, ···, m_k, which means
that it is divisible by each m_i and hence by their product since they have no common
factors. Thus x ≡ y mod m_1 ··· m_k, which shows that all the solutions lie in a single
congruence class mod m_1 ··· m_k. Moreover every number in this congruence class is
a solution since if x is one solution and y ≡ x mod m_1 ··· m_k then y ≡ x mod m_i
for each i, so x ≡ a_i mod m_i implies y ≡ a_i mod m_i. □


As an illustration of the method in this proof let us find all numbers that are
congruent to 7 mod 9 and to 8 mod 11. First we find a solution of 9n_1 + 11n_2 = 1
by the earlier methods. One such solution is (n_1, n_2) = (5,−4). The formula x =
a_1 m_2 n_2 + a_2 m_1 n_1 then gives x = −7·11·4 + 8·9·5 = −308 + 360 = 52. We are free
to change this by adding any multiple of 9·11, so the general solution is 52 + 99t for
arbitrary integers t. If we were to modify the problem by adding a third congruence
condition such as x ≡ 4 mod 7 then we would just be solving the two congruences
x ≡ 52 mod 99 and x ≡ 4 mod 7 by the same method.
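
The method of the proof translates directly into a short program. In the following sketch the names bezout, crt_pair, and crt are hypothetical, the extended Euclidean algorithm stands in for the Farey diagram method of finding n_1 and n_2, and the moduli are assumed to be positive and pairwise coprime.

```python
def bezout(m1, m2):
    """Return (n1, n2) with m1*n1 + m2*n2 == 1, assuming gcd(m1, m2) == 1."""
    # Extended Euclidean algorithm.
    old_r, r = m1, m2
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r != 1:
        raise ValueError("moduli must be coprime")
    return old_s, old_t

def crt_pair(a1, m1, a2, m2):
    """Solve x = a1 mod m1, x = a2 mod m2; return (x, m1*m2) with 0 <= x < m1*m2."""
    n1, n2 = bezout(m1, m2)
    x = a1 * m2 * n2 + a2 * m1 * n1      # the formula from the proof
    return x % (m1 * m2), m1 * m2

def crt(congruences):
    """Fold crt_pair over a list of (a_i, m_i) with pairwise coprime moduli."""
    x, m = congruences[0]
    x %= m
    for a, mod in congruences[1:]:
        x, m = crt_pair(x, m, a, mod)
    return x, m

print(crt_pair(7, 9, 8, 11))             # (52, 99)
print(crt([(7, 9), (8, 11), (4, 7)]))    # (151, 693)
```

The second call adds the third congruence x ≡ 4 mod 7 in exactly the way described above, by combining x ≡ 52 mod 99 with x ≡ 4 mod 7.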

There is a geometric picture that gives a way of visualizing what the Chinese
Remainder Theorem is saying. Consider the case of two simultaneous congruences
x ≡ a mod m and x ≡ b mod n where m and n are coprime. We can then label
the mn unit squares in an m × n rectangle by the numbers 1, 2, 3, ··· starting in the
lower left corner and continuing upward to the right at a 45 degree angle as shown in
the following figure for the case of a 9 × 4 rectangle:


Whenever we run over the top edge, we jump back to the bottom in order to continue, 
and when we reach the right edge, we jump back to the left edge. This amounts 
to taking congruence classes mod m horizontally and mod n vertically. What the 
Chinese Remainder Theorem says is that when m and n are coprime, each unit square 
in the m x n rectangle is labeled exactly once by a number from 1 to mn. (Without 
the coprimeness some squares would have no labels while others would have multiple 
labels.) The figure thus illustrates that specifying a congruence class mod mn is 
equivalent to specifying a pair of congruence classes mod m and mod n via the 
projections onto the two axes. 
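
The labeling rule is easy to simulate. The sketch below uses hypothetical function names label_grid and show, and numbers columns from 1 to m and rows from 1 to n so that a residue of 0 is recorded as column m or row n. It prints the 9 × 4 rectangle and also checks the remark about what goes wrong when the sides are not coprime.

```python
def label_grid(m, n):
    """Place 1..m*n in an m-by-n grid: column = x mod m, row = x mod n.

    Columns are numbered 1..m from the left and rows 1..n from the bottom,
    so a residue of 0 is recorded as column m (or row n).
    Returns a dict mapping (column, row) to the list of labels landing there.
    """
    grid = {}
    for x in range(1, m * n + 1):
        col = (x - 1) % m + 1
        row = (x - 1) % n + 1
        grid.setdefault((col, row), []).append(x)
    return grid

def show(m, n):
    """Print the grid with the top row first (m and n assumed coprime)."""
    grid = label_grid(m, n)
    for row in range(n, 0, -1):
        print(" ".join(f"{grid[(col, row)][0]:3d}" for col in range(1, m + 1)))

show(9, 4)
# Coprime sides: each of the 36 squares gets exactly one label.
assert all(len(v) == 1 for v in label_grid(9, 4).values())
assert len(label_grid(9, 4)) == 36
# Non-coprime sides: some squares stay empty and others are hit repeatedly.
assert len(label_grid(6, 4)) < 24
```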

For the case of three simultaneous congruences there is an analogous picture with 
a three-dimensional rectangular box partitioned into unit cubes. More generally, for 
k congruences one would be dealing with a k-dimensional box. 


A common situation for applying the Chinese Remainder Theorem is to start
with a number n factored as n = p_1^{r_1} ··· p_k^{r_k} for distinct primes p_1, ···, p_k, so that
a congruence x ≡ a mod n is equivalent to a set of k congruences x ≡ a_i mod p_i^{r_i}.
If we add the condition that each a_i is not divisible by the corresponding prime p_i
then a simultaneous solution x = a for all k congruences must be coprime to n since
a ≡ a_i mod p_i^{r_i} implies a ≡ a_i mod p_i and we assume a_i is nonzero mod p_i so
a is also nonzero mod p_i. Conversely, if a is coprime to n and satisfies a set of
congruences a ≡ a_i mod p_i^{r_i} and hence a ≡ a_i mod p_i, then a_i must be nonzero
mod p_i since a is. Thus congruence classes mod n of numbers a coprime to n
are equivalent to congruence classes mod p_i^{r_i} of numbers a_i coprime to p_i, one for
each i.

In the geometric picture for the case k = 2 with a rectangular array of unit
squares, if we require a_1 to be coprime to p_1 then we are omitting the numbers
in certain vertical columns of squares, the columns whose horizontal coordinate is a
multiple of p_1. Similarly, when we require a_2 to be coprime to p_2 we omit the numbers
in the horizontal rows whose vertical coordinate is a multiple of p_2. The numbers
in the boxes that are not omitted are then the numbers coprime to n = p_1^{r_1} p_2^{r_2}. Here
is the picture for the case n = 3^2 · 2^2:


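    28  20  12   4  32  24  16   8  36
    19  11   3  31  23  15   7  35  27
    10   2  30  22  14   6  34  26  18
     1  29  21  13   5  33  25  17   9

[Figure: the 9 × 4 rectangle of labels, with the squares in columns 3, 6, 9 and in rows 2, 4 shaded.]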


Here the 12 unshaded squares are what is left after columns 3, 6, and 9 are excluded 
along with rows 2 and 4. In other words we delete multiples of 2 and 3, leaving the 
numbers 1,5, 7,11,13,17,19, 23,25, 29,31,35 as the numbers less than 36 that are 
coprime to 36. 

In the corresponding three-dimensional picture for k = 3 we would be omitting 
the cubes in certain slices parallel to the three coordinate planes, and similarly for 
k>3. 

We can now obtain a formula for the Euler phi function φ(n) which counts the
number of congruence classes mod n of integers coprime to n. The arguments above
show that φ(n) = φ(p_1^{r_1}) ··· φ(p_k^{r_k}) when n = p_1^{r_1} ··· p_k^{r_k} for distinct primes p_i.
For a prime p we have φ(p^r) = p^r − p^{r−1} = p^{r−1}(p − 1) since we are counting how
many numbers remain from 1, 2, 3, ···, p^r after we delete p, 2p, 3p, ···, (p^{r−1})p =
p^r. Thus we have a formula for φ(n):

    φ(n) = p_1^{r_1−1}(p_1 − 1) p_2^{r_2−1}(p_2 − 1) ··· p_k^{r_k−1}(p_k − 1)
         = n ((p_1 − 1)/p_1) ((p_2 − 1)/p_2) ··· ((p_k − 1)/p_k)

If we omit the factor n from this last product, the remaining product of the terms
(p_i − 1)/p_i tells what proportion of the numbers less than n are coprime to n. Notice
that this does not depend on the exponents r_i. For example φ(36) = φ(4)φ(9) =
2·6 = 12, which is 1/2 · 2/3 = 1/3 times 36, in agreement with the preceding figure.

The way that φ(n) varies with n is rather erratic since the prime factorizations
of adjacent numbers are not related. For example we have φ(1000) = φ(2^3 5^3) =
2^2(2 − 1)5^2(5 − 1) = 400, in agreement with the fact that the numbers coprime to 2
and 5 are the numbers with last digit 1, 3, 7, or 9, which means four out of every ten
numbers or 400 out of the first 1000 numbers. For the adjacent numbers 999 and
1001 we have φ(999) = φ(3^3·37) = 18·36 = 648 and φ(1001) = φ(7·11·13) =
6·10·12 = 720.
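
Both the definition of φ(n) and the product formula are simple to program, which gives an independent check on the values just computed. The function names phi_by_counting and phi_by_formula below are hypothetical, and trial division is used for factoring only because the numbers are small.

```python
from math import gcd

def phi_by_counting(n):
    """Count the classes 1..n that are coprime to n (the definition)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def phi_by_formula(n):
    """Apply phi(n) = n * prod((p - 1) / p) over the primes p dividing n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p        # multiply by (p - 1) / p exactly
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                            # one prime factor may remain
        result -= result // m
    return result

for n in (15, 36, 999, 1000, 1001):
    assert phi_by_counting(n) == phi_by_formula(n)
    print(n, phi_by_formula(n))
```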


The Chinese Remainder Theorem can be applied to give an example of a Diophantine
equation that has a solution mod n for each positive integer n but does not
have an actual integer solution. The example is the equation 2x^2 + 7y^2 = 1. This
obviously has no integer solutions, although it does have rational solutions such as
(x,y) = (1/3, 1/3) and (3/5, 1/5). We can use either of these rational solutions to get
a solution mod n for certain values of n in the following way. Let us take the solution
(3/5, 1/5) for example. This rational solution will give an integer solution mod n
provided that 5 has a multiplicative inverse “1/5” mod n. For example for n = 14 a
multiplicative inverse for 5 is 3 since 5·3 ≡ 1 mod 14. If we multiply the equation
2(3/5)^2 + 7(1/5)^2 = 1 by 5^2 to get 2·3^2 + 7·1^2 = 5^2 and then multiply by 3^2, the
inverse of 5^2 mod 14, we get 2·9^2 + 7·3^2 ≡ 1 mod 14.

This argument gives a solution of 2x^2 + 7y^2 ≡ 1 mod n whenever 5 has a
multiplicative inverse mod n. As we saw earlier in this section, this happens whenever
5 is coprime to n, which means that 5 does not divide n. Similarly, using the other
rational solution (1/3, 1/3) we can solve 2x^2 + 7y^2 ≡ 1 mod n whenever 3 does not
divide n by finding a multiplicative inverse for 3 mod n.

There remains the possibility that n is divisible by both 3 and 5, and this is where
the Chinese Remainder Theorem will be used. Consider for example the case n = 30.
We can factor this as 5·6 where one factor is not divisible by 3 and the other is not
divisible by 5. By the method above we can obtain a solution of 2x^2 + 7y^2 ≡ 1 mod 5
from (1/3, 1/3) using 3·2 ≡ 1 mod 5 so (1/3, 1/3) becomes (2,2). For 2x^2 + 7y^2 ≡
1 mod 6 we use (3/5, 1/5) and the fact that 5·5 ≡ 1 mod 6 so (3/5, 1/5) becomes
(3·5, 5) ≡ (3,5) mod 6. Thus we want to find (x,y) with (x,y) ≡ (2,2) mod 5
and (x,y) ≡ (3,5) mod 6. This we do by two applications of the Chinese Remainder
Theorem, once for x and once for y. We use the earlier formula a_1 m_2 n_2 + a_2 m_1 n_1
where 5n_1 + 6n_2 = 1 so n_1 = −1 and n_2 = 1. This yields x = 2·6·1 − 3·5·1 = −3
and y = 2·6·1 − 5·5·1 = −13. Thus 2(−3)^2 + 7(−13)^2 ≡ 1 mod 5 and mod 6. This
implies the congruence also holds mod 30 since the difference 2(−3)^2 + 7(−13)^2 − 1
is divisible by 5 and by 6, hence by 30 since 5 and 6 are coprime. This method
for the case n = 30 works for any n divisible by 3 and 5 since any such n can be
factored as n = kl where k is not divisible by 3, l is not divisible by 5, and k and l
are coprime; for example one can take k to be the largest power of 5 dividing n.
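
The whole argument can be turned into a procedure that produces a solution mod n for every n ≥ 2. This is a sketch under stated assumptions: the function name solve_mod is hypothetical, the splitting n = kl takes k to be the largest power of 5 dividing n (one convenient choice among many), and the inverses are computed with Python's three-argument pow, which requires Python 3.8 or later.

```python
def solve_mod(n):
    """Return (x, y) with 2*x*x + 7*y*y == 1 (mod n), for any integer n >= 2."""
    if n % 5 != 0:                       # use the rational solution (3/5, 1/5)
        inv5 = pow(5, -1, n)
        return (3 * inv5) % n, inv5 % n
    if n % 3 != 0:                       # use the rational solution (1/3, 1/3)
        inv3 = pow(3, -1, n)
        return inv3 % n, inv3 % n
    # n is divisible by 3 and 5: split n = k * l with k a power of 5.
    k = 1
    while (n // k) % 5 == 0:
        k *= 5
    l = n // k                           # l is not divisible by 5, gcd(k, l) = 1
    xk, yk = solve_mod(k)                # k is not divisible by 3
    xl, yl = solve_mod(l)
    nk, nl = pow(k, -1, l), pow(l, -1, k)
    # Chinese Remainder Theorem, applied once for x and once for y.
    x = (xk * l * nl + xl * k * nk) % n
    y = (yk * l * nl + yl * k * nk) % n
    return x, y

for n in range(2, 500):
    x, y = solve_mod(n)
    assert (2 * x * x + 7 * y * y - 1) % n == 0
print(solve_mod(14), solve_mod(30))      # (9, 3) (27, 17)
```

For n = 30 the output (27, 17) agrees with the solution (−3, −13) found above, since 27 ≡ −3 and 17 ≡ −13 mod 30.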

One might ask how rational solutions of 2x^2 + 7y^2 = 1 such as (1/3, 1/3) and
(3/5, 1/5) can be found. Rational solutions of 2x^2 + 7y^2 = 1 are equivalent to integer
solutions of 2x^2 + 7y^2 = z^2 so we are looking for integers x and y such that
2x^2 + 7y^2 is a square. This is a special case of the general problem of solving quadratic
Diophantine equations ax^2 + bxy + cy^2 = n which will be a central theme of the
book starting in Chapter 4.


A Digression on Rational Points on Quadratic Curves 


A key point in the preceding example was the existence of rational solutions of
2x^2 + 7y^2 = 1, which correspond to rational points on the curve 2x^2 + 7y^2 = 1,
so let us consider now the general problem of determining when a quadratic curve
ax^2 + bxy + cy^2 = d contains rational points. Here a, b, c, and d are rational
numbers but there is no loss of generality in assuming they are integers since we can
multiply the equation by a common denominator for a, b, c, and d if they are not
all integers.

The first step is to reduce to the case that b = 0. If a ≠ 0 we can write:

    ax^2 + bxy + cy^2 = a(x + (b/(2a))y)^2 + (c − b^2/(4a))y^2

Then if we change variables to X = x + (b/(2a))y and Y = y this converts the equation
ax^2 + bxy + cy^2 = d into the equation aX^2 + c'Y^2 = d for c' = c − b^2/(4a). Rational
values of x and y give rational values for X and Y, and conversely rational values for
X and Y give rational values for x and y since the change of variables is reversible,
with x = X − (b/(2a))Y and y = Y. If a = 0 and c ≠ 0 we can change variables as above
but with a and c reversed. If both a and c are 0 the equation is bxy = d which
always has rational solutions if b ≠ 0.

Thus it suffices to determine whether curves ax^2 + by^2 = c have rational points.
Again we can multiply through by a common denominator to make a, b, and c integers.
We assume a, b, and c are nonzero to avoid trivial cases. To have solutions
we obviously need to assume that a and b do not have one sign and c the opposite
sign.

If rational numbers x and y satisfy ax^2 + by^2 = c we can put them over a
common denominator and write them as quotients X/Z and Y/Z for integers X, Y, Z,
and then the equation becomes aX^2 + bY^2 = cZ^2 for which we are seeking integer
solutions (X,Y,Z). With three variables instead of two it may appear that we have
made the problem more complicated, but an advantage of the new equation is that
it is homogeneous in the sense that all three terms have the same degree, namely 2.
This means that if (X,Y,Z) is a solution, then so is (kX,kY,kZ) for any constant k.
In particular, rational solutions can always be converted to integer solutions. The
homogeneous equation has the trivial solution (0,0,0) but this is not very interesting
so we will always exclude this trivial solution. In fact we will need solutions with Z ≠ 0
to get actual points (x,y) = (X/Z, Y/Z) on the curve ax^2 + by^2 = c.

Thus we are asking when an equation ax^2 + by^2 = cz^2 has an integer or rational
solution (x,y,z) ≠ (0,0,0). There are a few preliminary simplifications in the
coefficients a, b, c that can be made. Suppose first that a factors as a'd^2 for some
integers a' and d > 1. The equation can then be written as a'(dx)^2 + by^2 = cz^2,
and finding rational solutions of ax^2 + by^2 = cz^2 is equivalent to finding rational
solutions of a'x^2 + by^2 = cz^2. Square factors of b and c can be absorbed into y^2
and z^2 in the same way. Thus there is no loss of generality in assuming that each
of the coefficients a, b, c in ax^2 + by^2 = cz^2 is squarefree, that is, has no square
factors greater than 1.

If all three coefficients a, b, c have a common prime factor p we can of course
divide the equation by p to get a simpler equation. Repeating this step, we may
assume no prime p divides all three coefficients. If p divides two of the coefficients,
say a = pa' and b = pb', we can still simplify the equation by multiplying it by p
to get a'(px)^2 + b'(py)^2 = pcz^2 which can be written as a'x^2 + b'y^2 = pcz^2 by
absorbing p into x and y, and this is a simpler equation in that |abc| has decreased
by a factor of p. The new equation still has squarefree coefficients since we could
assume that the divisor p of a and b was not also a divisor of c. By the same
reasoning we can arrange also that a and c are coprime and b and c are coprime,
with all three coefficients still squarefree.

Now we have Legendre’s Theorem as described in Chapter 0:


Theorem 2.6. An equation ax^2 + by^2 = cz^2 with a, b, and c squarefree coprime
nonzero integers has an integer solution (x,y,z) ≠ (0,0,0) exactly when the following
conditions are satisfied: ac is a square mod b, bc is a square mod a, −ab
is a square mod c, and a and b do not both have the opposite sign from c.


A more symmetric statement could be obtained by changing the sign of c and
writing the equation as ax^2 + by^2 + cz^2 = 0. Then the conditions would be that −ac
is a square mod b, −bc is a square mod a, −ab is a square mod c, and the three
coefficients a, b, c do not all have the same sign.
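
For small coefficients the conditions in Legendre's Theorem can be tested by brute force and compared with a direct search for solutions. The sketch below uses the form ax^2 + by^2 = cz^2 of Theorem 2.6. The function names is_square_mod, legendre_solvable, and small_solution are hypothetical, the search bound is an arbitrary choice, and the hypotheses that a, b, c are nonzero, squarefree, and pairwise coprime are assumed rather than checked.

```python
from itertools import product

def is_square_mod(m, n):
    """True if m is congruent to a square mod n (brute force); n may be negative."""
    n = abs(n)
    return any((x * x - m) % n == 0 for x in range(n))

def legendre_solvable(a, b, c):
    """Conditions of Theorem 2.6 for a*x^2 + b*y^2 = c*z^2.

    Assumes a, b, c are nonzero, squarefree, and pairwise coprime;
    these hypotheses are not checked here.
    """
    signs_ok = not ((a > 0) == (b > 0) and (a > 0) != (c > 0))
    return (signs_ok and is_square_mod(a * c, b)
            and is_square_mod(b * c, a) and is_square_mod(-a * b, c))

def small_solution(a, b, c, bound=30):
    """Search for a nonzero (x, y, z) with entries in 0..bound."""
    for x, y, z in product(range(bound + 1), repeat=3):
        if (x, y, z) != (0, 0, 0) and a * x * x + b * y * y == c * z * z:
            return (x, y, z)
    return None

print(legendre_solvable(2, 7, 1), small_solution(2, 7, 1))   # True (1, 1, 3)
print(legendre_solvable(1, 1, 3), small_solution(1, 1, 3))   # False None
```

The first example recovers the rational point (1/3, 1/3) on 2x^2 + 7y^2 = 1 from the integer solution (1,1,3). In the second example the condition that −ab = −1 be a square mod c = 3 fails, and correspondingly the search finds nothing.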


Proof: First we show that these congruence conditions must be satisfied if a solution
exists. Suppose that we have a solution (x,y,z) ≠ (0,0,0) of ax^2 + by^2 = cz^2. We
can assume each pair of x, y, z is coprime since for example if a prime p divides
x and y then p^2 divides ax^2 + by^2 hence it divides cz^2, which implies p divides
z since c is squarefree. Then the solution (x,y,z) could be simplified by dividing
by p.

The equation ax^2 + by^2 = cz^2 implies that ax^2 ≡ cz^2 mod b. After multiplying
this congruence by c we get acx^2 ≡ c^2z^2 mod b. Now, x and b are coprime since
any prime dividing both would divide ax^2 + by^2 = cz^2 and so would divide c or z,
neither of which is possible since b and c are coprime and x and z are coprime. Since
x is coprime to b it has a multiplicative inverse mod b. Multiplying the congruence
acx^2 ≡ c^2z^2 mod b by the square of this inverse, we conclude that ac is a square
mod b. In the same way we see that bc is a square mod a and −ab is a square
mod c.

The converse is considerably harder to prove, so let us first outline what the
strategy will be. We will use the more symmetric equation ax^2 + by^2 + cz^2 = 0. If
the left side of this equation could be factored as

    ax^2 + by^2 + cz^2 = (a_1 x + b_1 y + c_1 z)(a_2 x + b_2 y + c_2 z)

with all coefficients integers, then finding a solution of ax^2 + by^2 + cz^2 = 0 would
be rather easy since we would just have to solve the linear Diophantine equation
obtained by setting either factor equal to 0. However, factorizations like this rarely
exist. Instead we will show that the congruence conditions in the theorem guarantee
that there is a factorization modulo a suitable number n, namely n = abc. What this
means concretely is that if one multiplies out the product of the two linear factors on
the right in the displayed equation above, then the coefficients of the x^2, y^2, and z^2
terms will be congruent to a, b, and c mod n and the coefficients of the xy, yz,
and xz terms will be 0 mod n. A solution of either congruence a_i x + b_i y + c_i z ≡
0 mod abc, say a_1 x + b_1 y + c_1 z ≡ 0 mod abc, will then give a solution of the
congruence ax^2 + by^2 + cz^2 ≡ 0 mod abc.

The next step in the proof will be to show that a solution (x,y,z) of the congruence
a_1 x + b_1 y + c_1 z ≡ 0 mod abc can be chosen so that the value of ax^2 + by^2 + cz^2
is a fairly small multiple of abc, in fact either 0 or ±abc. The last step in the proof
will then be a rather subtle trick to convert a solution of ax^2 + by^2 + cz^2 = ±abc
into a solution of ax^2 + by^2 + cz^2 = 0.

Now we begin to fill in details. To factor ax^2 + by^2 + cz^2 mod abc we first factor
it mod a, b, and c separately. To factor it mod a we just need to factor by^2 + cz^2
mod a. Multiplying by^2 + cz^2 by b gives b^2y^2 + bcz^2. We are assuming that −bc is a
square mod a so we have −bc ≡ r^2 mod a for some integer r. Then b^2y^2 + bcz^2 ≡
b^2y^2 − r^2z^2 mod a with b^2y^2 − r^2z^2 factoring as (by + rz)(by − rz). Since b
is coprime to a it has an inverse b^{-1} mod a so after multiplying the congruence
b^2y^2 + bcz^2 ≡ (by + rz)(by − rz) mod a by b^{-1} we have the desired factorization
by^2 + cz^2 ≡ (y + b^{-1}rz)(by − rz) mod a. Thus there is a factorization mod a
of ax^2 + by^2 + cz^2 as a product (a_1 x + b_1 y + c_1 z)(a_2 x + b_2 y + c_2 z) where the
coefficients a_1 and a_2 happen to be 0, but this will not be significant for the rest of
the argument.

In the same way there are similar factorizations of ax^2 + by^2 + cz^2 mod b and
mod c, with possibly different coefficients a_1, b_1, c_1, a_2, b_2, c_2 of the linear factors.
The Chinese Remainder Theorem, applied once for each of the six coefficients, implies
that there is a single choice for the coefficients that works mod a, b, and c simultaneously.
Since a, b, and c are coprime, the factorization then holds mod abc.

We will be interested in triples (x,y,z) of integers satisfying three inequalities

    0 ≤ x < α    0 ≤ y < β    0 ≤ z < γ        (∗)

for positive real numbers α, β, and γ that are not necessarily integers. To count
how many triples (x,y,z) satisfy (∗) let A(α) be the number of integers x with
0 ≤ x < α, so A(α) = α if α is an integer and A(α) = 1 + ⌊α⌋ if α is not an integer,
where ⌊α⌋ is the largest integer less than or equal to α. Thus A(α) > α if α is not
an integer. The number of triples (x,y,z) satisfying (∗) is then A(α)A(β)A(γ).

If A(α)A(β)A(γ) > |abc| there must exist two different triples (x',y',z') and
(x'',y'',z'') satisfying (∗) such that a_1 x' + b_1 y' + c_1 z' ≡ a_1 x'' + b_1 y'' + c_1 z''
mod abc. The triple (x,y,z) = (x' − x'', y' − y'', z' − z'') ≠ (0,0,0) will then satisfy
a_1 x + b_1 y + c_1 z ≡ 0 mod abc. The triple (|x|,|y|,|z|) will also satisfy (∗) so
x^2 < α^2, y^2 < β^2, and z^2 < γ^2.

For the triple (x,y,z) we have ax^2 + by^2 + cz^2 ≡ 0 mod abc from the factorization
of ax^2 + by^2 + cz^2 mod abc. Since a, b, and c do not all have the same
sign, we can assume two are positive and one is negative by multiplying the equation
by −1 if necessary. After a possible permutation of the coefficients we can assume
that a > 0, b > 0, and c < 0. Since x^2 < α^2, y^2 < β^2, and z^2 < γ^2 we then have:

    cγ^2 < cz^2 ≤ ax^2 + by^2 + cz^2 ≤ ax^2 + by^2 < aα^2 + bβ^2


If we choose α = √|bc|, β = √|ac|, and γ = √|ab| then these inequalities give the
inequalities −|abc| < ax^2 + by^2 + cz^2 < 2|abc|. Since ax^2 + by^2 + cz^2 ≡ 0 mod |abc|
we must therefore have either ax^2 + by^2 + cz^2 = 0 or ax^2 + by^2 + cz^2 = |abc|.
The chosen values for α, β, and γ also give αβγ = |abc| so the earlier hypothesis
A(α)A(β)A(γ) > |abc| becomes A(α)A(β)A(γ) > αβγ which is satisfied unless α,
β, and γ are all integers. Since a, b, and c are coprime and squarefree, α, β,
and γ are all integers only when a, b, and c are ±1, but in this case the equation
ax^2 + by^2 + cz^2 = 0 is just x^2 + y^2 − z^2 = 0 which has obvious integer solutions.

All that remains is to deal with the possibility ax^2 + by^2 + cz^2 = |abc|, so
ax^2 + by^2 + cz^2 = −abc. Rewriting this equation as ax^2 + by^2 + c(z^2 + ab) = 0,
we would like to convert it into an equation of the form aX^2 + bY^2 + cZ^2 = 0. This
suggests that we multiply the equation by z^2 + ab to get a term cZ^2 = c(z^2 + ab)^2.
Multiplying ax^2 + by^2 by z^2 + ab, we have:

    (ax^2 + by^2)(z^2 + ab) = ax^2z^2 + a^2bx^2 + by^2z^2 + ab^2y^2
                            = a(xz + by)^2 + b(yz − ax)^2

Thus (X,Y,Z) = (xz + by, yz − ax, z^2 + ab) is a solution of aX^2 + bY^2 + cZ^2 = 0,
and this is not the trivial solution (0,0,0) since Z = z^2 + ab > 0. □
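
The algebraic identity behind this last step can be spot-checked numerically. In the sketch below the function name lift is hypothetical, the random ranges are arbitrary, and the closing example uses the simplest coefficients (a,b,c) = (1,1,−1), for which (x,y,z) = (1,0,0) satisfies ax^2 + by^2 + cz^2 = 1 = −abc.

```python
from random import randint

def lift(a, b, x, y, z):
    """Map a solution of a*x^2 + b*y^2 + c*z^2 = -a*b*c to (X, Y, Z)."""
    return (x * z + b * y, y * z - a * x, z * z + a * b)

# The identity (a x^2 + b y^2)(z^2 + ab) = a (xz + by)^2 + b (yz - ax)^2
# holds for all integers, so a random spot check should never fail.
for _ in range(1000):
    a, b, x, y, z = (randint(-20, 20) for _ in range(5))
    X, Y, Z = lift(a, b, x, y, z)
    assert (a * x * x + b * y * y) * Z == a * X * X + b * Y * Y

# Example with (a, b, c) = (1, 1, -1): x^2 + y^2 - z^2 = 1 = -abc at (1, 0, 0).
a, b, c = 1, 1, -1
X, Y, Z = lift(a, b, 1, 0, 0)
print((X, Y, Z), a * X * X + b * Y * Y + c * Z * Z)   # (0, -1, 1) 0
```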


To apply Legendre’s Theorem one needs to be able to determine which numbers
are squares modulo a given number n. The brute force approach is just to compute
all the possible squares. For example for n = 15 the numbers mod 15 are
0, ±1, ±2, ±3, ±4, ±5, ±6, and ±7 so the squares mod 15 are obtained by squaring
these to get 0, 1, 4, 9, 16 ≡ 1, 25 ≡ 10, 36 ≡ 6, and 49 ≡ 4. Thus only six of the fifteen
congruence classes mod 15 are squares mod 15, namely 0, 1, 4, 6, 9, and 10. This
approach becomes tedious for large values of n, but in Section 6.2 we will develop
more efficient methods for determining whether a number m is a square mod n,
which turns out to be quite a subtle question.
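
The brute force approach takes only a line or two of Python. The function name squares_mod is a hypothetical name for this sketch, which simply squares every residue rather than using the symmetry between x and −x.

```python
def squares_mod(n):
    """All congruence classes mod n that are squares, by brute force."""
    return sorted({(x * x) % n for x in range(n)})

print(squares_mod(15))   # [0, 1, 4, 6, 9, 10]
```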


Exercises 


1. (a) Find all integer solutions of the equations 40x + 89y = 1 and 40x + 89y = 5.
(b) Find another equation ax + by = 1 with integer coefficients a and b that has an
integer solution in common with 40x + 89y = 1. Hint: Use the Farey diagram.


2. Find all integers x satisfying the congruence 31x ≡ 1 mod 71, and then do the
same for the congruence 31x ≡ 10 mod 71. Are the solutions unique mod 71, i.e.,
unique up to adding multiples of 71?


3. Find all integer solutions of the equation 9x + 12y + 20z = 4, and do this more
generally for 9x + 12y + 20z = n.


4. Find all solutions of the simultaneous congruences x ≡ 6 mod 13 and x ≡ 7
mod 18.


5. Show that for the Euler phi function the values φ(n) approach infinity as n approaches
infinity. In other words, show that for each number N > 0 there are only
finitely many numbers n with φ(n) < N.


6. For each n < 10 determine which numbers are squares mod n by direct calculation. 


7. Determine which curves ax^2 + by^2 = c contain rational points for each triple of
coprime integers a, b, c chosen from the numbers 1, 2, 3, 5. When rational points
exist, find a specific one.


Symmetries of the Farey Diagram


A notable feature of the various versions of the Farey diagram is their symmetry. 
For the circular Farey diagram the symmetries are the reflections across the horizontal 
and vertical axes and the 180 degree rotation about the center. For the upper half- 
plane Farey diagram there are symmetries that translate the diagram by any integer 
distance to the left or the right, as well as reflections across certain vertical lines, the 
vertical lines through an integer or half-integer point on the x-axis. The Farey diagram 
could also be drawn to have 120 degree rotational symmetry and three reflectional 
symmetries. 


Our purpose in this chapter is to study all possible symmetries of the Farey diagram, 
where we interpret the word “symmetry” in a broader sense than the familiar meaning 
from Euclidean geometry. For our purposes, symmetries will be invertible transfor- 
mations that take vertices to vertices, edges to edges, and triangles to triangles. There 
are simple algebraic formulas for these more general symmetries, and these formulas 
lead to effective means of calculation. An application in this chapter will be to comput- 
ing the values of periodic or eventually periodic continued fractions, and symmetries 
of the diagram will play key roles in later chapters as well. 


3.1 Linear Fractional Transformations 


Our first goal will be to find formulas for all the symmetry transformations of
the Farey diagram. The formulas will specify where each vertex is sent so they will
have the form $T(x/y) = x'/y'$. It is easy to write down such formulas for some of
the simpler symmetries. Reflection of the circular Farey diagram across the vertical
axis sends a fraction $x/y$ to $y/x$ so it is the transformation $T(x/y) = y/x$. Reflection
across the horizontal axis is $T(x/y) = -x/y$. Composing these two transformations in
either order gives a 180 degree rotation of the Farey diagram about its centerpoint, the
transformation $T(x/y) = -y/x$. For the upper halfplane Farey diagram the horizontal
translation to the right by $n$ units is $T(x/y) = x/y + n = (x+ny)/y$, while a leftward
translation is $T(x/y) = x/y - n = (x-ny)/y$. All these formulas work equally well for
the fraction $x/y = \pm 1/0$ with the exception of $x/y \pm n$, where the alternative forms
$(x+ny)/y$ and $(x-ny)/y$ are preferable and give $T(1/0) = 1/0$.

In these examples the transformations have the form $T(x/y) = (ax+by)/(cx+dy)$
for integers $a,b,c,d$. Another notation is to let $z = x/y$ and then we have:

$$T(z) = T\left(\frac{x}{y}\right) = \frac{ax+by}{cx+dy} = \frac{a(x/y)+b}{c(x/y)+d} = \frac{az+b}{cz+d}$$

A transformation of the type $T(x/y) = (ax+by)/(cx+dy)$ or $T(z) = (az+b)/(cz+d)$ is called
a linear fractional transformation since it is defined by a fraction whose numerator
and denominator are linear functions. Fractions $x/y$, including $\pm 1/0$, correspond to
pairs $(x,y)$ and from this point of view linear fractional transformations
$T(x/y) = (ax+by)/(cx+dy)$ correspond to linear transformations $T(x,y) = (ax+by, cx+dy)$.

In matrix notation this becomes
$T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax+by \\ cx+dy \end{pmatrix}$.
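
As a small illustration of the matrix point of view, the following Python sketch applies a 2 x 2 integer matrix to a fraction stored as a pair (x, y). The names `apply_lft` and `normalize` are hypothetical helpers introduced here, and the convention of writing both +1/0 and -1/0 as (1, 0) is an assumption made for convenience.

```python
from math import gcd

def normalize(v):
    """Reduce x/y to lowest terms with y >= 0; both +1/0 and -1/0 become (1, 0)."""
    x, y = v
    g = gcd(x, y)
    x, y = x // g, y // g
    if y < 0 or (y == 0 and x < 0):
        x, y = -x, -y
    return (x, y)

def apply_lft(m, v):
    """Apply T(x/y) = (ax+by)/(cx+dy), with m = ((a, b), (c, d)) and v = (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    return normalize((a * x + b * y, c * x + d * y))

vertical_reflection = ((0, 1), (1, 0))     # x/y -> y/x
horizontal_reflection = ((-1, 0), (0, 1))  # x/y -> -x/y
rotation_180 = ((0, -1), (1, 0))           # x/y -> -y/x
translate_by_2 = ((1, 2), (0, 1))          # x/y -> (x + 2y)/y

print(apply_lft(vertical_reflection, (2, 3)))  # (3, 2)
print(apply_lft(translate_by_2, (2, 3)))       # (8, 3)
print(apply_lft(translate_by_2, (1, 0)))       # (1, 0)
```

Working with pairs rather than quotients is what lets the vertex 1/0 be handled without a special case.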


Linear fractional transformations $T(x/y) = (ax+by)/(cx+dy)$ that give symmetries
of the Farey diagram must take vertices to vertices and edges to edges, so let us see
what this means for the coefficients $a,b,c,d$ which we will always assume are
integers. Vertices of the Farey diagram are fractions $p/q$ in lowest terms, including $\pm 1/0$,
with $p/q$ determining the same vertex as $-p/-q$. This ambiguity causes no problem
for linear fractional transformations $T(x/y) = (ax+by)/(cx+dy)$ since
$(ax+by)/(cx+dy) = (-ax-by)/(-cx-dy)$ so $T(p/q) = T(-p/-q)$. For $T$ to take vertices to
vertices means that for a fraction $p/q$ in lowest terms we would like
$T(p/q) = (ap+bq)/(cp+dq)$ to be in lowest terms as well. For $T$ to take edges to edges
means that if $(p/q, r/s)$ is an edge we want $((ap+bq)/(cp+dq), (ar+bs)/(cr+ds))$ to be an
edge also. In matrix terms this last condition is saying that if
$\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ has determinant $\pm 1$ then
$\begin{pmatrix} ap+bq & ar+bs \\ cp+dq & cr+ds \end{pmatrix}$, which is the product
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix}$, should also have determinant $\pm 1$.
It is a general fact that the determinant of the product of two matrices is the product
of the determinants of the two matrices. (For $2 \times 2$ matrices this is easy to check by a
direct calculation.) Thus for $\begin{pmatrix} ap+bq & ar+bs \\ cp+dq & cr+ds \end{pmatrix}$ to have determinant $\pm 1$
when $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ has determinant $\pm 1$ the exact condition we need is that
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ should have determinant $\pm 1$.


Proposition 3.1. If the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with integer entries has determinant $\pm 1$ then
the associated linear fractional transformation $T(x/y) = (ax+by)/(cx+dy)$ takes
vertices in the Farey diagram to vertices in the diagram and it takes each pair of
vertices that are joined by an edge to another pair of vertices that are joined by an
edge.


It follows that $T$ must take triangles in the diagram to triangles in the diagram
since triangles correspond to sets of three vertices, any two of which are the endpoints
of an edge.

Proof: We have shown that if $(p/q, r/s)$ is an edge of the Farey diagram then so is
$(T(p/q), T(r/s))$ when the matrix of $T$ has determinant $\pm 1$. This implies that $T$ takes
vertices to vertices since each vertex $p/q$ is an endpoint of some edge $(p/q, r/s)$,
so $T(p/q)$ is an endpoint of the edge $(T(p/q), T(r/s))$ and therefore the fraction
$T(p/q) = (ap+bq)/(cp+dq)$ is in lowest terms. □
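
The proposition can be spot-checked numerically. The sketch below assumes the determinant criterion used in the argument above, namely that p/q and r/s are joined by an edge exactly when ps - qr = +1 or -1. The function names and the sample edges are illustrative choices, not part of the text.

```python
from math import gcd

def is_edge(v, w):
    """Vertices p/q and r/s are joined by an edge exactly when ps - qr = +1 or -1."""
    (p, q), (r, s) = v, w
    return abs(p * s - q * r) == 1

def apply(m, v):
    """Apply the matrix m = ((a, b), (c, d)) to the pair v = (x, y) representing x/y."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

def preserves_edges(m, edges):
    """Check that m sends each listed edge to an edge whose endpoints are in lowest terms."""
    for v, w in edges:
        tv, tw = apply(m, v), apply(m, w)
        if not is_edge(tv, tw) or gcd(*tv) != 1 or gcd(*tw) != 1:
            return False
    return True

sample_edges = [((1, 0), (0, 1)), ((0, 1), (1, 1)), ((1, 2), (1, 3)), ((2, 5), (3, 7))]
print(preserves_edges(((4, 19), (9, 43)), sample_edges))  # matrix of determinant +1
print(preserves_edges(((2, 7), (3, 10)), sample_edges))   # matrix of determinant -1
print(preserves_edges(((2, 0), (0, 1)), sample_edges))    # matrix of determinant 2
```

A finite check like this proves nothing, but the third matrix shows how the conclusion fails as soon as the determinant is not +1 or -1.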


We will use the notation LF(Z) for the set of all linear fractional transformations
$T(x/y) = (ax+by)/(cx+dy)$ with coefficients $a,b,c,d$ in Z such that the matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ has determinant $\pm 1$. (Here Z is the set of integers.)


Changing a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to its negative
$-\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}$ produces the same
linear fractional transformation since $(-ax-by)/(-cx-dy) = (ax+by)/(cx+dy)$. This is in
fact the only way that different matrices with integer entries and determinant $\pm 1$ can
give the same linear fractional transformation, by the following argument. The
transformation $T(x/y) = (ax+by)/(cx+dy)$ takes $1/0$ to $a/c$ and $0/1$ to $b/d$ so $T$ determines
each column of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ up to a sign. Changing the sign of both columns gives the
same transformation so the only question is whether changing the sign of one column
could give the same transformation. Changing the sign of the first column has
the same effect as changing the sign of the second column since changing the sign of
both columns gives the same transformation. So suppose that we change the sign of
the second column, changing $(ax+by)/(cx+dy)$ to $(ax-by)/(cx-dy)$. If we apply these two
transformations to $1/1$ we get $(a+b)/(c+d)$ and $(a-b)/(c-d)$. These fractions are in lowest
terms by the previous proposition, so if they give the same vertex of the Farey diagram
we would have either $a+b = a-b$ and $c+d = c-d$, hence $b = 0$ and $d = 0$, or we
would have $a+b = b-a$ and $c+d = d-c$, hence $a = 0$ and $c = 0$. In either case
the condition $ad - bc = \pm 1$ is violated. Thus we see that changing the sign of only
one column of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ gives a different transformation, finishing the argument.


If we are given two linear fractional transformations $T(x/y) = (ax+by)/(cx+dy)$ and
$S(x/y) = (ex+fy)/(gx+hy)$ then we can compose them to get another linear fractional
transformation:

$$T(S(x/y)) = \frac{a(ex+fy) + b(gx+hy)}{c(ex+fy) + d(gx+hy)} = \frac{(ae+bg)x + (af+bh)y}{(ce+dg)x + (cf+dh)y}$$

The matrix of this transformation is just the product
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} e & f \\ g & h \end{pmatrix} = \begin{pmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{pmatrix}$,
so composition of linear fractional transformations corresponds to matrix
multiplication. It follows that if $T$ and $S$ are in LF(Z) then so is their composition $TS$, which
is also referred to as their product.
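
The correspondence between composition and matrix multiplication can be checked directly on pairs (x, y). In this sketch the helper names `matmul` and `apply` and the two sample transformations are illustrative choices.

```python
def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def apply(m, v):
    """Apply the matrix m to the pair v = (x, y) representing the fraction x/y."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)

T = ((1, 2), (0, 1))  # x/y -> (x + 2y)/y, translation by 2
S = ((0, 1), (1, 0))  # x/y -> y/x, reflection across the vertical axis
TS = matmul(T, S)

# Applying S and then T agrees with applying the single matrix TS.
for v in [(1, 0), (0, 1), (1, 1), (3, 7)]:
    assert apply(T, apply(S, v)) == apply(TS, v)
print(TS)
```

Note the order: the product TS means S is applied first, matching the formula for T(S(x/y)).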

A transformation $T$ in LF(Z) has an inverse $T^{-1}$ in LF(Z) because the inverse
of a $2 \times 2$ matrix is given by the formula

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

so if $a,b,c,d$ are integers with $ad - bc = \pm 1$ then the inverse matrix also has integer
entries and determinant $\pm 1$. When computing the inverse of a transformation in
LF(Z) the factor $1/(ad-bc)$ can be ignored since it is $\pm 1$ and replacing a matrix by its
negative gives the same linear fractional transformation, as we observed above.
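
A minimal sketch of the inverse formula for integer matrices of determinant +1 or -1 follows. The helper names are hypothetical, and the code keeps the factor 1/(ad - bc) so that the product with the original matrix is exactly the identity.

```python
def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def inverse(m):
    """Inverse of an integer matrix with determinant +1 or -1."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det not in (1, -1):
        raise ValueError("determinant must be +1 or -1")
    # Dividing by det is the same as multiplying by det when det is +1 or -1.
    return ((det * d, -det * b), (-det * c, det * a))

identity = ((1, 0), (0, 1))
for m in [((4, 19), (9, 43)), ((2, 7), (3, 10))]:
    assert matmul(m, inverse(m)) == identity
    assert matmul(inverse(m), m) == identity
print(inverse(((2, 7), (3, 10))))
```

Dropping the factor `det` would give the negative of this matrix when the determinant is -1, which defines the same linear fractional transformation.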

For a matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the key property of its inverse $A^{-1}$ is that both products
$AA^{-1}$ and $A^{-1}A$ are equal to the identity matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, corresponding to the identity
transformation $I(x/y) = x/y$. Thus for any transformation $T$ in LF(Z) we have
$TT^{-1} = I$ and $T^{-1}T = I$. The formula $T^{-1}T = I$ implies that $T$ gives a one-to-one
transformation of vertices since if two vertices $v_1$ and $v_2$ have the same image
$T(v_1) = T(v_2)$ then we must have $T^{-1}(T(v_1)) = T^{-1}(T(v_2))$ so $v_1 = v_2$ and hence
$T$ cannot send two different vertices to the same vertex, which means it is one-to-one
as a transformation from vertices to vertices. Also, the formula $TT^{-1} = I$ implies
that every vertex $v_1$ is the image $T(v_2)$ of some vertex $v_2$ since we can write
$v_1 = T(T^{-1}(v_1))$ and let $v_2 = T^{-1}(v_1)$. The same reasoning applies not just for vertices
but also for edges and triangles. Thus $T$ can never send two edges to the same edge
or two triangles to the same triangle, and every edge or triangle is the image of some
edge or triangle.


Transformations in LF(Z) can be divided into two types according to whether
they preserve or reverse the orientations of triangles. A triangle
in the Farey diagram can be oriented either clockwise or counterclockwise
by choosing either a clockwise or counterclockwise
ordering of its three vertices. Thus if the vertices are $v_1, v_2, v_3$
then this ordering of the vertices determines one orientation as
in the figures at the right. This is the same orientation as when
the vertices are ordered $v_2, v_3, v_1$ or $v_3, v_1, v_2$. The other three
orderings determine the opposite orientation.

[Figure: two triangles with vertices labeled $v_1, v_2, v_3$, showing the two possible orientations.]

A transformation $T$ in LF(Z) takes each triangle to another triangle in a way that
either preserves the two possible orientations or reverses them:

[Figure: a triangle with vertices $v_1, v_2, v_3$ and its image with vertices $T(v_1), T(v_2), T(v_3)$ in the orientation-preserving and orientation-reversing cases.]


If a transformation preserves the orientation of one triangle, it has to preserve the
orientation of the three adjacent triangles, and then of the triangles adjacent to these,
and so on for all the triangles. Similarly, if the orientation of one triangle is reversed,
then the orientations of all triangles are reversed. For example, reflection of the
circular Farey diagram across its horizontal or vertical axis reverses the orientation of
all triangles, while a 180 degree rotation of the diagram preserves the orientation of
all triangles. A translation of the upper halfplane diagram by any number of units
left or right preserves orientations of triangles while a reflection across a vertical line
through an integer or half-integer point on the x-axis reverses orientations.


As we have seen, the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ corresponding to a linear fractional
transformation $T(x/y) = (ax+by)/(cx+dy)$ is unique up to multiplication by $-1$. The determinant $ad - bc$
does not change when each of $a,b,c,d$ is changed to its negative, so each
transformation in LF(Z) has a well-defined determinant, either $+1$ or $-1$. The sign has a
geometric interpretation:


Proposition 3.2. An orientation-preserving transformation in LF(Z) has determinant
$+1$ and an orientation-reversing transformation has determinant $-1$.


Proof: Consider a transformation $T(x/y) = (ax+by)/(cx+dy)$ in LF(Z) associated to a
matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. If we multiply one column of the matrix by $-1$ this changes the sign
of the determinant. Let us check that it also changes whether $T$ preserves or reverses
orientation. Changing the sign of one column changes where $T$ takes the triangle
$(1/0, 0/1, 1/1)$ from $(a/c, b/d, (a+b)/(c+d))$ to $(a/c, b/d, (a-b)/(c-d))$. These two triangles
are different as we saw earlier, so they lie on opposite sides of the edge $(a/c, b/d)$
and hence have opposite orientations. Thus the validity of the proposition for the
transformation $T$ is unaffected by changing one column of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to its negative.

Applying this fact, we can arrange that $c \ge 0$ and $d \ge 0$ by multiplying columns
by $-1$ if necessary. If $c = 0$ the condition $ad - bc = \pm 1$ implies $a = \pm 1$ and $d = 1$
(since $d \ge 0$), and then by multiplying the first column by $-1$ if necessary we can
arrange that $a = 1$ so the matrix is $\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$. This matrix has determinant $+1$ and the
associated transformation sends the triangle $(1/0, 0/1, 1/1)$ to $(1/0, b/1, (b+1)/1)$ so it
preserves orientation. Similarly, if $d = 0$ we can assume the matrix is $\begin{pmatrix} a & 1 \\ 1 & 0 \end{pmatrix}$ with
determinant $-1$ and the associated transformation takes $(1/0, 0/1, 1/1)$ to $(a/1, 1/0, (a+1)/1)$
so it reverses orientation.

Thus we have reduced to the case that $c > 0$ and $d > 0$. The transformation
$T$ takes the triangle $(1/0, 0/1, 1/1)$ to $(a/c, b/d, (a+b)/(c+d))$ whose third vertex is the
mediant of the first two. The edge $(a/c, b/d)$ lies in either the upper or lower half of
the circular Farey diagram, and in either case the orientation of $(a/c, b/d, (a+b)/(c+d))$
given by the ordering of its vertices is the same as the orientation of $(1/0, 0/1, 1/1)$
exactly when $a/c > b/d$. Since $c > 0$ and $d > 0$ the inequality $a/c > b/d$ is equivalent
to $ad - bc > 0$. Thus $T$ is orientation-preserving exactly when $ad - bc = +1$. □


In what follows, when we say that a transformation $T$ in LF(Z) takes a triangle
$(p/q, r/s, t/u)$ to a triangle $(p'/q', r'/s', t'/u')$ we will mean that $T(p/q) = p'/q'$,
$T(r/s) = r'/s'$, and $T(t/u) = t'/u'$ so $T$ preserves the order of the vertices. Similarly, when we say
that $T$ takes an edge $(p/q, r/s)$ to an edge $(p'/q', r'/s')$ we will mean that $T(p/q) = p'/q'$
and $T(r/s) = r'/s'$.

Proposition 3.3. (a) For any two triangles $(p/q, r/s, t/u)$ and $(p'/q', r'/s', t'/u')$ in the
Farey diagram there is a unique transformation in LF(Z) taking $(p/q, r/s, t/u)$ to
$(p'/q', r'/s', t'/u')$.

(b) For any two edges $(p/q, r/s)$ and $(p'/q', r'/s')$ there is a unique
orientation-preserving transformation in LF(Z) taking $(p/q, r/s)$ to $(p'/q', r'/s')$.

Proof: For a given pair of edges $(p/q, r/s)$ and $(p'/q', r'/s')$ let $T_1$ be the
transformation with matrix $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ and let $T_2$ be the transformation with matrix
$\begin{pmatrix} p' & r' \\ q' & s' \end{pmatrix}$ so $T_1$ takes $(1/0, 0/1)$ to $(p/q, r/s)$ and $T_2$ takes $(1/0, 0/1)$ to
$(p'/q', r'/s')$. The composition $T = T_2T_1^{-1}$ then takes $(p/q, r/s)$ to $(p'/q', r'/s')$. Hence $T$
takes $(p/q, r/s, t/u)$ to either $(p'/q', r'/s', t'/u')$ or the other triangle $(p'/q', r'/s', t''/u'')$
having $(p'/q', r'/s')$ as an edge.
For one of these two possibilities $T$ is orientation-preserving and for the other $T$ is
orientation-reversing. We can change whether $T$ preserves or reverses orientation by
changing the signs in one column of the matrix of $T_1$ or $T_2$. Thus we can arrange that
$T$ takes $(p/q, r/s, t/u)$ to $(p'/q', r'/s', t'/u')$.

A transformation in LF(Z) taking $(p/q, r/s, t/u)$ to $(p'/q', r'/s', t'/u')$ is unique
since it must take the three triangles sharing an edge with $(p/q, r/s, t/u)$ to the three
triangles sharing the corresponding edges with $(p'/q', r'/s', t'/u')$, and then this
determines where the next layer of six triangles sharing an edge with the three triangles
adjacent to $(p/q, r/s, t/u)$ are sent, and so on until all triangles are accounted for.

For part (b) we have found a product $T_2T_1^{-1}$ taking $(p/q, r/s)$ to $(p'/q', r'/s')$, and if
this product is orientation-reversing, we can make it orientation-preserving by
changing the sign of one column of the matrix of $T_1$ or $T_2$. An orientation-preserving
transformation taking $(p/q, r/s)$ to $(p'/q', r'/s')$ is unique since if it preserves
orientations this determines where it sends the two triangles adjacent to $(p/q, r/s)$ and then
uniqueness follows from the uniqueness in part (a). □
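
The construction in the proof, forming the product of the matrix of the target edge with the inverse of the matrix of the source edge and then flipping one column if the orientation comes out wrong, can be sketched as follows. The function names and the sample edges (1/2, 2/5) and (8/19, 27/64) are illustrative choices.

```python
def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def det(m):
    (a, b), (c, d) = m
    return a * d - b * c

def inverse(m):
    """Inverse of an integer matrix with determinant +1 or -1."""
    (a, b), (c, d) = m
    k = det(m)
    return ((k * d, -k * b), (-k * c, k * a))

def edge_to_edge(e1, e2, preserve_orientation=True):
    """Matrix of a transformation taking the edge e1 to the edge e2, endpoints in order."""
    (p, q), (r, s) = e1
    (p2, q2), (r2, s2) = e2
    t1_inv = inverse(((p, r), (q, s)))
    t = matmul(((p2, r2), (q2, s2)), t1_inv)
    if preserve_orientation and det(t) == -1:
        # Flip the sign of the second column of the matrix of T_2, as in the proof.
        t = matmul(((p2, -r2), (q2, -s2)), t1_inv)
    return t

def same_vertex(v, w):
    """The pairs (x, y) and (-x, -y) represent the same vertex x/y."""
    return v == w or v == (-w[0], -w[1])

e1 = ((1, 2), (2, 5))
e2 = ((8, 19), (27, 64))
t = edge_to_edge(e1, e2)
print(t, det(t))
```

With `preserve_orientation=False` the function returns the unmodified product, which for these two edges has determinant -1.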


In the remainder of this section we will describe five fairly simple types of sym- 
metries of the Farey diagram given by elements of LF(Z). Two other slightly more 
complicated types of symmetries will be described in the next section where they arise 
in connection with continued fractions. 


(1) The diagram can be reflected across any of its edges, leaving this edge fixed and
interchanging the two triangles adjacent to it. This then determines where all the other
triangles are sent. The simplest case is reflection across $(1/0, 0/1)$, the transformation
$T(x/y) = -x/y$. To obtain a reflection across an arbitrary edge $(a/b, c/d)$, let $S$ be
the transformation with matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$. The composition $STS^{-1}$ sends $(a/b, c/d)$ first
to $(1/0, 0/1)$ by $S^{-1}$, then $T$ leaves this edge fixed, then $S$ sends it back to $(a/b, c/d)$.
Thus $STS^{-1}$ leaves $(a/b, c/d)$ fixed so $STS^{-1}$ is either the identity transformation or
reflection across $(a/b, c/d)$. The transformations $S$ and $S^{-1}$ either both preserve
orientation or both reverse orientation, while $T$ reverses orientation, so $STS^{-1}$ reverses
orientation and is therefore reflection across the edge $(a/b, c/d)$. Its matrix can easily
be computed:

$$\begin{pmatrix} a & c \\ b & d \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} d & -c \\ -b & a \end{pmatrix} = \begin{pmatrix} -a & c \\ -b & d \end{pmatrix}\begin{pmatrix} d & -c \\ -b & a \end{pmatrix} = \begin{pmatrix} -ad-bc & 2ac \\ -2bd & ad+bc \end{pmatrix}$$
For example, the matrix giving reflection across $(1/2, 1/3)$ is $\begin{pmatrix} -5 & 2 \\ -12 & 5 \end{pmatrix}$. This can be
checked by noting that its determinant is $-1$ and it fixes $1/2$ and $1/3$.
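
The reflection formula in (1) can be checked against the explicit product of three matrices. In this sketch the function names are hypothetical and the edge (1/2, 1/3) is chosen purely for illustration; as in the text, the factor 1/(ad - bc) in the inverse of S is dropped since it is +1 or -1.

```python
def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def reflection_across_edge(a, b, c, d):
    """Closed formula for the matrix of the reflection across the edge (a/b, c/d)."""
    return ((-a * d - b * c, 2 * a * c), (-2 * b * d, a * d + b * c))

def reflection_by_conjugation(a, b, c, d):
    """The same matrix computed as S T S^{-1}, without the factor 1/(ad - bc)."""
    s = ((a, c), (b, d))
    t = ((-1, 0), (0, 1))
    s_adjugate = ((d, -c), (-b, a))
    return matmul(matmul(s, t), s_adjugate)

r = reflection_across_edge(1, 2, 1, 3)
print(r)  # ((-5, 2), (-12, 5))
```

Applying `r` to the pairs (1, 2) and (1, 3) returns (-1, -2) and (1, 3), which are the same two vertices, so the edge is fixed.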


(2) The diagram can also be reflected across an arc perpendicular to any of its edges,
any of the dotted arcs in the figure at the right. Each of the two triangles this arc
crosses is then sent to itself by a reflection that interchanges the two vertices at the
ends of the given edge and fixes the two vertices at the endpoints of the dotted arc
crossing the edge. A special case is reflection across the vertical axis of the circular
Farey diagram, $T(x/y) = y/x$. Reflection across an arc perpendicular to an edge
$(a/b, c/d)$ can be realized as $STS^{-1}$ with $S$ having matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ as before since
$STS^{-1}$ then interchanges $a/b$ and $c/d$ and is orientation-reversing. It is not hard to
compute the matrix of $STS^{-1}$ and we leave this as an exercise.


(3) The diagram can be rotated 180 degrees about the midpoint of any edge,
interchanging the two adjacent triangles. This rotation is the composition of the reflection
across this edge and the reflection across the arc perpendicular to the edge. Rotation
about the midpoint of $(1/0, 0/1)$ is $T(x/y) = -y/x$ so rotation about the midpoint of
an edge $(a/b, c/d)$ is $STS^{-1}$ with the same $S$ as before since $STS^{-1}$ interchanges the
endpoints of $(a/b, c/d)$ and is orientation-preserving.


(4) The diagram can be rotated by 120 degrees in either direction about the
centerpoint of any triangle, the point of intersection of the three dotted arcs that cross the
triangle in the figure above. In particular this rotates the triangle itself about its
centerpoint. A simple case is the rotation of the triangle $(1/0, 0/1, 1/1)$ by 120 degrees
counterclockwise. This is given by the transformation $T(x/y) = y/(y-x)$ with matrix
$\begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix}$ which has determinant 1 and takes the edge $(1/0, 0/1)$ to $(0/1, 1/1)$. For
rotation of an arbitrary triangle $(a/b, c/d, e/f)$ we may assume its vertices have been
ordered to give it a counterclockwise orientation, so the transformation $S$ with
matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ takes $(1/0, 0/1, 1/1)$ to this triangle. Then $STS^{-1}$ rotates $(a/b, c/d, e/f)$ by
120 degrees counterclockwise since it is orientation-preserving and takes $(a/b, c/d)$
to $(c/d, e/f)$. Again the matrix for $STS^{-1}$ could easily be computed.


(5) The diagram can be pivoted about any vertex $v$. If the vertices joined to $v$ by
edges are labeled $v_i$ for all integers $i$, with $v_i$ joined to $v_{i+1}$ by an edge, then there
is a pivoting transformation $T$ sending each triangle $(v, v_i, v_{i+1})$ to the next
triangle $(v, v_{i+1}, v_{i+2})$. The powers $T^n$ are then also pivoting transformations sending
$(v, v_i, v_{i+1})$ to $(v, v_{i+n}, v_{i+n+1})$ where $n$ can be any nonzero integer, positive or
negative. (When $n = 0$ one just has the identity transformation sending each vertex
to itself.) For example, horizontal translation of the upper halfplane Farey diagram by
any number of units to the right or left amounts to pivoting about the vertex $1/0$. The
transformation $T_n$ pivoting $n$ steps counterclockwise about $1/0$ has matrix $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$.
For an arbitrary vertex $a/b$, if $S$ is an orientation-preserving transformation taking
$1/0$ to $a/b$, then $S$ takes the infinite fan of triangles containing $1/0$ to the infinite fan
containing $a/b$, so $ST_nS^{-1}$ will pivot $n$ steps counterclockwise about $a/b$. The
different choices for $S$ have matrices $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ with $ad - bc = 1$, so $ST_nS^{-1}$ has matrix

$$\begin{pmatrix} a & c \\ b & d \end{pmatrix}\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}\begin{pmatrix} d & -c \\ -b & a \end{pmatrix} = \begin{pmatrix} a & na+c \\ b & nb+d \end{pmatrix}\begin{pmatrix} d & -c \\ -b & a \end{pmatrix} = \begin{pmatrix} 1-nab & na^2 \\ -nb^2 & 1+nab \end{pmatrix}$$

where for the last equality we use the fact that $ad - bc = 1$. Note that $c$ and $d$ do not
appear in the final answer, reflecting the fact that the pivoting transformation only
depends on the pivot vertex $a/b$ and $n$. For example when $a/b = 0/1$ we get the matrix
$\begin{pmatrix} 1 & 0 \\ -n & 1 \end{pmatrix}$ for pivoting $n$ steps counterclockwise about $0/1$.
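
The pivoting formula in (5), and in particular the claim that the result does not depend on the choice of c and d, can be checked with a short sketch. The function names, the pivot vertex 2/5, and the two choices of S are illustrative assumptions.

```python
def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def pivot_matrix(a, b, n):
    """Closed formula for pivoting n steps counterclockwise about the vertex a/b."""
    return ((1 - n * a * b, n * a * a), (-n * b * b, 1 + n * a * b))

def pivot_by_conjugation(a, b, c, d, n):
    """The product S T_n S^{-1} for S with matrix ((a, c), (b, d)), assuming ad - bc = 1."""
    assert a * d - b * c == 1
    s = ((a, c), (b, d))
    t_n = ((1, n), (0, 1))
    s_inv = ((d, -c), (-b, a))
    return matmul(matmul(s, t_n), s_inv)

# Two different choices of S taking 1/0 to 2/5 give the same pivoting matrix.
print(pivot_by_conjugation(2, 5, 1, 3, 3))
print(pivot_by_conjugation(2, 5, 3, 8, 3))
print(pivot_matrix(2, 5, 3))
```

Setting a/b = 1/0 in `pivot_matrix` recovers the translation matrix with rows (1, n) and (0, 1).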


Exercises 


1. Find a formula for the linear fractional transformation that rotates the triangle 
(VaV OAY yo) 

2. Find the two orientation-reversing linear fractional transformations that take the
edge $(1/2, 1/3)$ to itself, possibly interchanging its two ends.


3. Find a formula for the linear fractional transformation that reflects the upper
halfplane version of the Farey diagram across the vertical line $x = 3/2$.


4. Compute the matrix of the transformation that reflects the Farey diagram across
an arc perpendicular to an edge $(a/b, c/d)$. Do the same for the 180 degree
rotation about the centerpoint of this edge, and for the 120 degree rotation of a triangle
$(a/b, c/d, e/f)$.

5. Express the transformation $T(x/y) = -y/x$ in four different ways as a composition
of three pivoting transformations about $1/0$ or $0/1$.


6. (a) Find all the transformations in LF(Z) that fix the vertex $1/0$, that is, take this
vertex to itself.

(b) Find all the transformations in LF(Z) that fix $0/1$.

(c) Determine which of the transformations in (a) and (b) are reflections and describe
these reflections.

(d) Show that if the transformation $T$ fixes $x/y$ then $STS^{-1}$ fixes $S(x/y)$.

(e) Find all the transformations in LF(Z) that fix $1/1$. Check that $T(x/y) = y/x$ is
among the transformations you have found.


3.2 Translations and Glide Reflections 


Linear fractional transformations can be used to compute the values of periodic
or eventually periodic infinite continued fractions, and to see that these values are
always quadratic irrational numbers. To illustrate this, consider the periodic continued
fraction $\overline{1\nearrow 2 + 1\nearrow 3 + 1\nearrow 1 + 1\nearrow 4}$. The associated periodic strip in the Farey diagram can
be extended to give an infinite strip that is periodic in both directions:

[Figure: the infinite periodic strip, with zigzag path through the vertices 1/0, 0/1, 1/2, 3/7, 4/9, 19/43.]

We would like to find a linear fractional transformation that gives the rightward
translation of this strip that exhibits the periodicity. The only possibility is the
transformation with matrix $\begin{pmatrix} 4 & 19 \\ 9 & 43 \end{pmatrix}$ since this sends the edge $(1/0, 0/1)$ to $(4/9, 19/43)$ and is
orientation-preserving since the matrix has determinant 1 in view of the inequality
$4/9 > 19/43$. This inequality can be verified either by a calculation or by visualizing
how the strip lies inside the circular Farey diagram, with the part of the strip to the
right of the edge $(1/0, 0/1)$ lying in the upper half of the diagram.

To see that the transformation $T$ with matrix $\begin{pmatrix} 4 & 19 \\ 9 & 43 \end{pmatrix}$ really does translate the
strip along itself we can argue as follows. Let us label the ten triangles between the
edges $(1/0, 0/1)$ and $(4/9, 19/43)$ as $t_1, t_2, \cdots, t_{10}$ from left to right, and then continue
this labeling with the subsequent triangles $t_{11}, t_{12}, \cdots$ to the right. We can build the
part of the strip to the right of the edge $(1/0, 0/1)$ by starting with this edge and first
adding the vertex $v_1$ just to the right of $1/0$ to form the triangle $t_1$, then adding
the vertex $v_2$ to form $t_2$, and so on repeatedly, adding successive vertices $v_i$ on
one border of the strip or the other to form the successive triangles $t_i$. Since $T$ is
orientation-preserving and takes $(1/0, 0/1)$ to $(4/9, 19/43)$ it must take the triangle $t_1$
to the triangle $t_{11}$ just to the right of the edge $(4/9, 19/43)$, so $T$ takes $v_1$ to $v_{11}$. The
triangle $t_2$ must then be taken to $t_{12}$ so $v_2$ is taken to $v_{12}$. In the same way we have
$T(t_i) = t_{i+10}$ and $T(v_i) = v_{i+10}$ for all $i \ge 1$ so $T$ translates the right half of the
strip along itself. For the left half of the strip we can apply similar reasoning to $T^{-1}$.
Thus $T^{-1}$ sends $t_{10}$ to the triangle just to the left of $(1/0, 0/1)$, then it sends $t_9$ to the
second triangle to the left of $(1/0, 0/1)$, and so on. We conclude from all this that $T$ is
indeed a translation of the strip along itself.

The fractions labeling the vertices along the zigzag path in the strip moving
toward the right are the convergents to $\overline{1\nearrow 2 + 1\nearrow 3 + 1\nearrow 1 + 1\nearrow 4}$. Call these convergents
$z_1, z_2, \cdots$ and their limit $z$. When we apply the translation $T$ we are taking each
convergent to a later convergent in the sequence, so both the sequence $\{z_n\}$ and the
sequence $\{T(z_n)\}$ converge to $z$. On the other hand the sequence $\{T(z_n)\}$ converges
to $T(z)$ since this is just saying that $(4z_n+19)/(9z_n+43)$ converges to $(4z+19)/(9z+43)$ as $z_n$
converges to $z$. Thus we have $T(z) = z$.

In summary, what we have just argued is that the value $z$ of the periodic continued
fraction $\overline{1\nearrow 2 + 1\nearrow 3 + 1\nearrow 1 + 1\nearrow 4}$ satisfies the equation $T(z) = z$, which is saying that
$z$ is a fixed point of the transformation $T$. Since $T(z) = (4z+19)/(9z+43)$ the equation
$T(z) = z$ becomes $(4z+19)/(9z+43) = z$ which simplifies to $9z^2 + 39z - 19 = 0$. The roots
of this equation are given by the quadratic formula:

$$z = \frac{-39 \pm \sqrt{39^2 + 4 \cdot 9 \cdot 19}}{18} = \frac{-39 \pm 3\sqrt{13^2 + 4 \cdot 19}}{18} = \frac{-13 \pm \sqrt{245}}{6} = \frac{-13 \pm 7\sqrt{5}}{6}$$

The positive root is the one that the right half of the infinite strip converges to, so we
have determined the value of the continued fraction:

$$\overline{1\nearrow 2 + 1\nearrow 3 + 1\nearrow 1 + 1\nearrow 4} = \frac{-13 + 7\sqrt{5}}{6}$$

Incidentally, the other root $(-13 - 7\sqrt{5})/6$ has an interpretation in terms of the
diagram as well: It is the limit of the numbers labeling the vertices of the zigzag path
moving off to the left rather than to the right. This follows by the same sort of
argument as above.
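
The fixed point computation can be sketched numerically. The code below assumes the standard fact, not stated in this section, that the matrix whose columns are the last two convergents is the product of the matrices with rows (0, 1) and (1, a_i) taken over the partial quotients a_i. All function names are hypothetical.

```python
from math import sqrt

def period_matrix(quotients):
    """Product of the matrices ((0, 1), (1, q)) over the partial quotients q of one period."""
    m = ((1, 0), (0, 1))
    for q in quotients:
        (a, b), (c, d) = m
        m = ((b, a + q * b), (d, c + q * d))  # m times ((0, 1), (1, q))
    return m

def fixed_points(m):
    """Roots of c z^2 + (d - a) z - b = 0, the fixed points of z -> (az + b)/(cz + d)."""
    (a, b), (c, d) = m
    r = sqrt((d - a) ** 2 + 4 * b * c)
    return ((a - d - r) / (2 * c), (a - d + r) / (2 * c))

def iterate_period(quotients, rounds=40):
    """Approximate the periodic continued fraction by unrolling the period many times."""
    z = 0.0
    for _ in range(rounds):
        for q in reversed(quotients):
            z = 1.0 / (q + z)
    return z

m = period_matrix([2, 3, 1, 4])
print(m)                       # ((4, 19), (9, 43))
print(max(fixed_points(m)))    # the positive root
print(iterate_period([2, 3, 1, 4]))
print((-13 + 7 * sqrt(5)) / 6)
```

The last three printed values agree to within floating-point error, which matches the closed form derived above.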


A periodic continued fraction with period of odd length has an associated infinite
strip with a different type of symmetry. As an example, consider $\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$.
Here the associated strip is:

[Figure: the infinite periodic strip, with zigzag path through the vertices 1/0, 0/1, 1/1, 2/3, 7/10.]

This strip is taken to itself by a transformation that takes $(1/0, 0/1)$ to $(2/3, 7/10)$ by
combining a translation along the strip with reflection across the horizontal axis of
the strip. A transformation of this type is called a glide reflection. The only linear
fractional transformation that could realize this glide reflection is the transformation
with matrix $\begin{pmatrix} 2 & 7 \\ 3 & 10 \end{pmatrix}$ since this takes $(1/0, 0/1)$ to $(2/3, 7/10)$ and is orientation-reversing
as its determinant is $-1$ since $2/3 < 7/10$. To check that this transformation gives
a glide reflection of the strip one can argue as in the preceding example that each
successive triangle to the right or left of $(1/0, 0/1)$ is moved along the strip in the same
way that the glide reflection moves it, keeping in mind that orientations are now being
reversed by both the glide reflection and the linear fractional transformation. The
reasoning shows that the translation or glide reflection symmetry of a periodic infinite
strip in the Farey diagram can always be realized by a linear fractional transformation.

Just as in the preceding example the value of the continued fraction can be
determined by solving the equation $T(z) = z$ where $T$ is the glide reflection. Thus we
have $(2z+7)/(3z+10) = z$ which simplifies to $3z^2 + 8z - 7 = 0$ with roots $(-4 \pm \sqrt{37})/3$.
The positive root gives the value of the continued fraction:

$$\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3} = \frac{-4 + \sqrt{37}}{3}$$

Continued fractions that are only eventually periodic can be treated in a similar
fashion. For example, consider $1\nearrow 2 + 1\nearrow 2 + \overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$. The corresponding infinite
strip is:

[Figure: the infinite strip, with zigzag path through the vertices 1/0, 0/1, 1/2, 2/5, 3/7, 8/19, 27/64.]

In this case if we discard the triangles corresponding to the initial nonperiodic part of
the continued fraction, $1\nearrow 2 + 1\nearrow 2$, and then extend the remaining periodic part in both
directions, we obtain a periodic strip that is carried to itself by the glide reflection $T$
taking $(1/2, 2/5)$ to $(8/19, 27/64)$:

[Figure: the periodic strip extended in both directions, with labeled vertices 1/2, 2/5, 3/7, 8/19, 27/64.]

We can compute $T$ as a composition of two transformations realizing the two-step
combination $(1/2, 2/5) \to (1/0, 0/1) \to (8/19, 27/64)$. Thus we consider the product

$$\begin{pmatrix} 8 & 27 \\ 19 & 64 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 2 & 5 \end{pmatrix}^{-1} = \begin{pmatrix} 8 & 27 \\ 19 & 64 \end{pmatrix}\begin{pmatrix} 5 & -2 \\ -2 & 1 \end{pmatrix} = \begin{pmatrix} -14 & 11 \\ -33 & 26 \end{pmatrix}$$

so we have $T(z) = (-14z+11)/(-33z+26)$. This transformation has determinant $-1$ so it is
the glide reflection we want. Now we solve $T(z) = z$. This means $(-14z+11)/(-33z+26) = z$,
which reduces to $33z^2 - 40z + 11 = 0$ with roots $z = (20 \pm \sqrt{37})/33$. Both roots
are positive, and we want the smaller one, $(20 - \sqrt{37})/33$, because along the top edge
of the periodic strip the numbers decrease as we move to the right approaching the
smaller root and they increase as we move to the left approaching the larger root.
Thus we have:

$$1\nearrow 2 + 1\nearrow 2 + \overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3} = \frac{20 - \sqrt{37}}{33}$$
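
The eventually periodic case can be sketched in the same way. The code assumes, as a standard fact not stated in this section, that the matrix taking (1/0, 0/1) to the last two convergents is the product of the matrices with rows (0, 1) and (1, a_i). Under that assumption the glide reflection is the matrix of the periodic part conjugated by the matrix of the nonperiodic part. Function names are hypothetical.

```python
from math import sqrt

def matmul(m, n):
    """Product of two 2 x 2 matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def cf_matrix(quotients):
    """Product of the matrices ((0, 1), (1, q)) over the given partial quotients."""
    m = ((1, 0), (0, 1))
    for q in quotients:
        m = matmul(m, ((0, 1), (1, q)))
    return m

def inverse(m):
    """Inverse of an integer matrix with determinant +1 or -1."""
    (a, b), (c, d) = m
    k = a * d - b * c
    return ((k * d, -k * b), (-k * c, k * a))

def fixed_points(m):
    """Roots of c z^2 + (d - a) z - b = 0, the fixed points of z -> (az + b)/(cz + d)."""
    (a, b), (c, d) = m
    r = sqrt((d - a) ** 2 + 4 * b * c)
    return ((a - d - r) / (2 * c), (a - d + r) / (2 * c))

pre = cf_matrix([2, 2])      # nonperiodic part, takes (1/0, 0/1) to (1/2, 2/5)
per = cf_matrix([1, 2, 3])   # one period
t = matmul(matmul(pre, per), inverse(pre))
print(t)                     # ((-14, 11), (-33, 26))
print(min(fixed_points(t)))  # the smaller root
print((20 - sqrt(37)) / 33)
```

The matrix printed first agrees with the product computed above, and the last two values agree to within floating-point error.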


Notice that $\sqrt{37}$ occurs in both this example and the preceding one where we
computed the value of $\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$. The explanation for this is that to get from
$\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$ to $1\nearrow 2 + 1\nearrow 2 + \overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$ one adds 2 and inverts, then adds 2 and
inverts again, and each of these operations of adding an integer or taking the
reciprocal takes place within the set $\mathbb{Q}(\sqrt{37})$ of all numbers of the form $a + b\sqrt{37}$ with $a$
and $b$ rational. More generally, this argument shows that any eventually periodic
continued fraction whose periodic part is $\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$ has as its value some number
in $\mathbb{Q}(\sqrt{37})$. However, not all irrational numbers in $\mathbb{Q}(\sqrt{37})$ have eventually periodic
continued fractions with periodic part $\overline{1\nearrow 1 + 1\nearrow 2 + 1\nearrow 3}$. For example, the continued
fraction for $\sqrt{37}$ itself is $6 + \overline{1\nearrow 12}$ with a different periodic part. (This can be checked
by computing the value of this continued fraction by the method above.)


The procedure we have used in these examples works in general for any irrational
number $z$ whose continued fraction is eventually periodic. From the periodic part of
the continued fraction one constructs a periodic infinite strip in the Farey diagram,
where the periodicity is given by a transformation $T(z) = (az+b)/(cz+d)$ in LF(Z), with
$T$ either a translation or a glide reflection of the strip. As we argued in the first
example, the number $z$ satisfies the equation $T(z) = z$. This becomes the quadratic
equation $az + b = cz^2 + dz$ with integer coefficients, or in more standard form,
$cz^2 + (d-a)z - b = 0$. We would like to apply the quadratic formula to find the roots
of this equation, but in order to do this the coefficient $c$ must be nonzero. Suppose on
the contrary that $c$ was zero. Then the determinant condition $ad - bc = \pm 1$ would
force $a$ to be $\pm 1$, and then from the first column of the matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} \pm 1 & b \\ 0 & d \end{pmatrix}$
we see that $T$ would take the vertex $1/0$ of the Farey diagram to itself. However a
translation or glide reflection symmetry of a periodic infinite strip cannot take any
vertex to itself since no vertex along the strip is taken to itself, and the other vertices
lie in the complement of the strip which consists of disjoint pieces, each containing
all the vertices lying on one side of an edge in the border of the strip, and a translation
or glide reflection of the strip takes each of these pieces to a different piece.

Knowing that $c$ is nonzero, we can apply the quadratic formula to deduce that
the roots of the equation $cz^2 + (d-a)z - b = 0$ have the form $A + B\sqrt{n}$ with $A$ and
$B$ rational numbers and $n$ an integer. We know that the real number $z$ defined by
the given continued fraction is a root of the equation so $n$ cannot be negative, and it
cannot be a square since $z$ is irrational.

Thus we have an argument that proves one half of Lagrange’s Theorem: 


Proposition 3.4. A real number whose continued fraction is periodic or eventually 
periodic is a quadratic irrational. 
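
The passage from a periodic continued fraction to a quadratic equation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the continued fraction is taken to be purely periodic, the function names are hypothetical, and the transformation $T$ is built as a product of the matrices $\begin{pmatrix} n & 1 \\ 1 & 0 \end{pmatrix}$ (one for each term of the period) rather than read off from a strip in the Farey diagram.

```python
from math import sqrt

def periodic_cf_equation(period):
    """Return integers (A, B, C) with A*z^2 + B*z + C = 0, where z is the
    purely periodic continued fraction whose repeating block is `period`."""
    a, b, c, d = 1, 0, 0, 1                 # start from the identity matrix
    for n in period:                        # multiply on the right by (n 1; 1 0)
        a, b, c, d = a * n + b, a, c * n + d, c
    # T(z) = (a z + b)/(c z + d) = z  becomes  c z^2 + (d - a) z - b = 0
    return c, d - a, -b

def positive_root(A, B, C):
    """The root of A z^2 + B z + C = 0 taken with the plus sign."""
    return (-B + sqrt(B * B - 4 * A * C)) / (2 * A)

print(periodic_cf_equation([1]))            # (1, -1, -1): z^2 - z - 1 = 0
print(periodic_cf_equation([1, 2]))         # (2, -2, -1): 2z^2 - 2z - 1 = 0
print(positive_root(2, -2, -1))             # approximately (1 + sqrt(3))/2
```

In both examples the leading coefficient $c$ is nonzero, as the argument above says it must be.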


The converse statement that the continued fraction for every quadratic irrational 
is periodic or eventually periodic will be proved in Proposition 4.1 and Theorem 5.2. 

Now let us show how translations and glide reflections can be realized as products
of simpler transformations. Consider a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $ad - bc = \pm 1$ and all four
entries $a, b, c, d$ positive integers. There is then a strip in the upper half of the circular
Farey diagram connecting the edge $\langle 1/0, 0/1 \rangle$ to the edge $\langle a/c, b/d \rangle$. One possible
configuration for this strip is the following:

[Figure: a strip of triangles in the upper half of the circular Farey diagram, from the edge $\langle 1/0, 0/1 \rangle$ to the edge $\langle a/c, b/d \rangle$.]


Here the first fan in the strip opens upward and the last fan opens downward, but there 
are three other possibilities depending on whether the first and last fans open upward 
or downward. When $a/c > b/d$ as in the figure, then $ad - bc = +1$ so the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$
defines an orientation-preserving transformation in $LF(\mathbb{Z})$ taking the edge $\langle 1/0, 0/1 \rangle$
to $\langle a/c, b/d \rangle$. This is a translation of the infinite periodic strip obtained by extending
the finite strip from $\langle 1/0, 0/1 \rangle$ to $\langle a/c, b/d \rangle$ periodically in both directions.

We can move the edge $\langle 1/0, 0/1 \rangle$ to $\langle a/c, b/d \rangle$ by a sequence of pivoting
transformations, one for each fan. One first pivots the edge $\langle 1/0, 0/1 \rangle$ across a fan of $a_1$
triangles to the second edge of the zigzag path, then this edge is pivoted across the
$a_2$ triangles in the second fan to the next edge of the zigzag path, and so on until we
reach the right edge $\langle a/c, b/d \rangle$. These pivotings are alternately in the clockwise and
counterclockwise direction, and the simplest pivotings of these two types are given
by matrices $\begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$ with $n > 0$, pivoting $n$ steps clockwise about $0/1$ or
counterclockwise about $1/0$ in the two cases. For the configuration of fans shown in
the figure, let us consider the following product:

$$\begin{pmatrix} 1 & 0 \\ a_1 & 1 \end{pmatrix}\begin{pmatrix} 1 & a_2 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ a_3 & 1 \end{pmatrix}\cdots\begin{pmatrix} 1 & 0 \\ a_{k-1} & 1 \end{pmatrix}\begin{pmatrix} 1 & a_k \\ 0 & 1 \end{pmatrix}$$

These matrices determine pivoting transformations that alternate between clockwise
and counterclockwise as they should, with the number of steps being $a_1, a_2, \cdots, a_k$
as we want. However there seem to be two things wrong with this product. First, the
order of the terms appears to be backwards since when we compose transformations
we proceed from right to left, so this product would first pivot $a_k$ steps, then $a_{k-1}$
steps, and so on, whereas we want to move the edge $\langle 1/0, 0/1 \rangle$ across the strip by
first pivoting $a_1$ steps, then $a_2$ steps, and so on. The other problem is that each
pivoting transformation in the product is pivoting about either $0/1$ or $1/0$ whereas the
pivotings that move the $\langle 1/0, 0/1 \rangle$ edge across the strip are pivoting about a sequence
of different vertices.

Surprisingly enough, these two problems cancel each other out, and the product
displayed above is actually correct and does equal $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. To see why, suppose we
superimpose a copy of the strip on top of the circular Farey diagram, but with the right
edge $\langle a/c, b/d \rangle$ lying on top of the edge $\langle 1/0, 0/1 \rangle$ and each triangle in the rest of the
strip lying exactly on top of a corresponding triangle in the lower half of the diagram.
If we apply the last matrix of the product to this repositioned strip, this pivots the
strip so that the next-to-last edge of the zigzag path lies on top of $\langle 1/0, 0/1 \rangle$. Then
applying the next-to-last matrix in the product to the newly positioned strip pivots it
so that the third-to-last edge of the zigzag path lies on top of $\langle 1/0, 0/1 \rangle$. Continuing
in this way, we end up with the left edge of the strip lying on top of $\langle 1/0, 0/1 \rangle$. This
means that the product of all the matrices takes the strip back to its original position,
so the product takes $\langle 1/0, 0/1 \rangle$ to the right edge of the strip, as we wanted.

The other three possibilities for whether the first and last fans open upward or
downward are treated in a similar fashion. For each fan opening upward one uses a
matrix $\begin{pmatrix} 1 & 0 \\ a_i & 1 \end{pmatrix}$ giving a pivoting transformation about $0/1$ and for each fan opening
downward one uses a matrix $\begin{pmatrix} 1 & a_i \\ 0 & 1 \end{pmatrix}$ pivoting about $1/0$.


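As an example consider the matrix $\begin{pmatrix} 9 & 4 \\ 29 & 13 \end{pmatrix}$ which has determinant 1 and corre-
sponds to the edge $\langle 9/29, 4/13 \rangle$ with $9/29 > 4/13$. The corresponding strip in the Farey
diagram is obtained by computing the continued fraction $9/29 = {}^1\!/_3 + {}^1\!/_4 + {}^1\!/_2$ as in
the first figure below:

[Figure: two strips of triangles starting at the edge $\langle 1/0, 0/1 \rangle$, the first ending at the edge $\langle 9/29, 4/13 \rangle$ and the second ending at the edge $\langle 13/4, 29/9 \rangle$.]

From this we can read off that $\begin{pmatrix} 9 & 4 \\ 29 & 13 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 3 & 1 \end{pmatrix}\begin{pmatrix} 1 & 4 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$. Similarly, for $\begin{pmatrix} 13 & 29 \\ 4 & 9 \end{pmatrix}$ we
have $29/9 = 3 + {}^1\!/_4 + {}^1\!/_2$ as in the second figure so $\begin{pmatrix} 13 & 29 \\ 4 & 9 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$. Notice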
that in both these cases the first and last fans in the strip open in the same direction, 
so if we extend the strip to an infinite periodic strip, this would produce adjacent fans 
with three and two triangles opening in the same direction, and each of these pairs of 
fans could be combined to give a single fan with five triangles. 
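
Factorizations of this kind are easy to check by machine. The following Python sketch multiplies out alternating pivot matrices for the fan sizes 3, 4, 2, starting either with a pivot about $0/1$ or with a pivot about $1/0$. The helper names `lower`, `upper`, and `pivot_product` are hypothetical, chosen only for this illustration.

```python
def matmul(M, N):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def lower(n):
    return ((1, 0), (n, 1))      # pivot n steps about 0/1

def upper(n):
    return ((1, n), (0, 1))      # pivot n steps about 1/0

def pivot_product(fans, first):
    """Multiply alternating pivot matrices, one per fan, left to right.
    `first` is 'lower' or 'upper' and selects the type of the first factor."""
    result = ((1, 0), (0, 1))
    use_lower = (first == 'lower')
    for n in fans:
        result = matmul(result, lower(n) if use_lower else upper(n))
        use_lower = not use_lower
    return result

print(pivot_product([3, 4, 2], 'lower'))   # ((9, 4), (29, 13))
print(pivot_product([3, 4, 2], 'upper'))   # ((13, 29), (4, 9))
```

Both products have determinant 1, as they must since every pivot matrix does.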

Glide reflection symmetries of infinite periodic strips cannot be expressed as
products of pivoting transformations since pivotings are orientation-preserving, but
glide reflections can be expressed as products of very simple glide reflections that, like
pivotings, move an edge across a single fan but are orientation-reversing. An example
is the transformation with matrix $\begin{pmatrix} 0 & 1 \\ 1 & n \end{pmatrix}$ for an integer $n > 0$. This transformation
takes the edge $\langle 1/0, 0/1 \rangle$ to $\langle 0/1, 1/n \rangle$ and is orientation-reversing, a glide reflection
symmetry of an infinite strip in which each fan has $n$ triangles. A transformation
with matrix $\begin{pmatrix} n & 1 \\ 1 & 0 \end{pmatrix}$ has similar behavior, taking $\langle 1/0, 0/1 \rangle$ to $\langle n/1, 1/0 \rangle$.

For example, the matrix $\begin{pmatrix} 4 & 9 \\ 13 & 29 \end{pmatrix}$ of determinant $-1$ gives a glide reflection tak-
ing the left edge of the first strip in the preceding figure to the right edge. This glide
reflection is a symmetry of the infinite strip obtained by first applying the glide reflec-
tion to the given strip to get a strip twice as long, then taking the periodic extension
of this doubled strip in both directions. The corresponding factorization of $\begin{pmatrix} 4 & 9 \\ 13 & 29 \end{pmatrix}$
is $\begin{pmatrix} 4 & 9 \\ 13 & 29 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 3 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 2 \end{pmatrix}$, as one can check by the method we used in the case of
translations of a periodic strip, placing a copy of the strip on top of the Farey diagram
with the right edge of the strip on top of the edge $\langle 1/0, 0/1 \rangle$, but with the copy flipped
over since we are now dealing with a glide reflection.
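
As a quick numerical check of a factorization into simple glide reflections, here is a small Python sketch. It multiplies the matrices $\begin{pmatrix} 0 & 1 \\ 1 & n \end{pmatrix}$ for $n = 3, 4, 2$ and confirms that the result has determinant $-1$ while its square has determinant $+1$, consistent with a glide reflection whose square is a translation. The function names are hypothetical.

```python
def matmul(M, N):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def det(M):
    (a, b), (c, d) = M
    return a * d - b * c

def glide(n):
    return ((0, 1), (1, n))      # takes 1/0 to 0/1 and 0/1 to 1/n

G = matmul(matmul(glide(3), glide(4)), glide(2))
print(G, det(G))                 # ((4, 9), (13, 29)) -1
T = matmul(G, G)                 # applying the glide reflection twice
print(T, det(T))                 # ((133, 297), (429, 958)) 1
```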

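More generally, for any matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of positive integers with determinant $\pm 1$ we
have an associated strip from the edge $\langle 1/0, 0/1 \rangle$ to $\langle a/c, b/d \rangle$, and we can express this
matrix as a product of the basic matrices $\begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 1 & n \end{pmatrix}$, or $\begin{pmatrix} n & 1 \\ 1 & 0 \end{pmatrix}$, by putting
arrows on the edges of the zigzag path in the strip to indicate orientations of the edges,
with the left edge oriented from $1/0$ to $0/1$ and the right edge oriented from $a/c$ to $b/d$
and the intermediate edges oriented arbitrarily. These orientations, together with the
directions that the fans open, determine the factors $\begin{pmatrix} 1 & 0 \\ a_i & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & a_i \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 \\ 1 & a_i \end{pmatrix}$, or $\begin{pmatrix} a_i & 1 \\ 1 & 0 \end{pmatrix}$
in the product representing $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$.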


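As an example, in the proof of Theorem 2.1 we made use of the following product:

$$\begin{pmatrix} 1 & a_0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & a_2 \end{pmatrix}\cdots\begin{pmatrix} 0 & 1 \\ 1 & a_n \end{pmatrix}$$

The corresponding strip is

[Figure: the strip of triangles for this product, with arrows on the edges of the zigzag path.]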


with the last fan on the right opening either downward as shown or possibly upward, 
depending on whether n is even or odd. The first fan has both its left and right edges 
oriented downward so the first matrix in the product gives the corresponding pivoting 
transformation, but all the other fans have both edges oriented to the right so they 
correspond to glide reflections, the other matrices in the product. If $a_0 = 0$ the first
matrix is the identity matrix $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ so it can be omitted, and the first fan is omitted
as well.


We have seen seven different types of symmetries of the Farey diagram. The four 
that preserve orientation are rotations about the midpoint of an edge or the center- 
point of a triangle, pivotings about a vertex, and translations of a periodic strip. The 
three that reverse orientation are reflections across an edge or an arc perpendicular 
to an edge, and glide reflections. These seven types along with the identity transfor- 
mation in fact give a complete list of all the types of symmetries of the Farey diagram. 
However, this fact will not be needed in the rest of the book so we will not digress to 
give a proof.


Exercises


1. Compute the value of each of the following continued fractions by first drawing the 
associated infinite strip of triangles, then finding a linear fractional transformation T 
in LF(Z) that gives the periodicity in the strip, then solving T(z) = z. 

a Yer, 

Oo Pye i 

O yt Yate Vie Vit Yi ye 

tetas bys 

CREFF AI a 

(OAL ior 

2. Find an infinite periodic strip of triangles in the Farey diagram such that the trans- 
formation a À is a glide reflection along this strip and h 3) G A = ( : = ) is a 
translation along the strip. 

3. Express the following transformations as compositions of pivot transformations: 
(a) T) = 13X +3 /e9x4 16y 

DTE = EVY 


Quadratic Forms 


Finding Pythagorean triples is answering the question of when the sum of two 
squares is equal to a square. This leads naturally to the broader question of exactly 
which numbers are sums of two squares. Thus one asks, when does an equation
$x^2 + y^2 = n$ have integer solutions, and how can one find these solutions? The brute
force approach of simply plugging in values for $x$ and $y$ leads to the following list
of all solutions for $n \le 50$ (apart from interchanging $x$ and $y$):

$1 = 1^2 + 0^2$, $2 = 1^2 + 1^2$, $4 = 2^2 + 0^2$, $5 = 2^2 + 1^2$, $8 = 2^2 + 2^2$, $9 = 3^2 + 0^2$,

$10 = 3^2 + 1^2$, $13 = 3^2 + 2^2$, $16 = 4^2 + 0^2$, $17 = 4^2 + 1^2$, $18 = 3^2 + 3^2$,

$20 = 4^2 + 2^2$, $25 = 5^2 + 0^2 = 4^2 + 3^2$, $26 = 5^2 + 1^2$, $29 = 5^2 + 2^2$, $32 = 4^2 + 4^2$,
$34 = 5^2 + 3^2$, $36 = 6^2 + 0^2$, $37 = 6^2 + 1^2$, $40 = 6^2 + 2^2$, $41 = 5^2 + 4^2$,
$45 = 6^2 + 3^2$, $49 = 7^2 + 0^2$, $50 = 5^2 + 5^2 = 7^2 + 1^2$
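
The brute force search just described is simple to automate. The following Python sketch, with the hypothetical function name `two_square_sums`, lists for each $n$ up to a given limit the pairs $(x, y)$ with $x \ge y \ge 0$ and $x^2 + y^2 = n$, leaving out the pair $(0,0)$.

```python
from math import isqrt

def two_square_sums(limit):
    """Map each n <= limit to the pairs (x, y), x >= y >= 0, with x^2 + y^2 = n."""
    sums = {}
    for x in range(1, isqrt(limit) + 1):
        for y in range(0, x + 1):
            n = x * x + y * y
            if n <= limit:
                sums.setdefault(n, []).append((x, y))
    return dict(sorted(sums.items()))

for n, pairs in two_square_sums(50).items():
    print(n, pairs)
```

The numbers 25 and 50 each appear with two pairs, in agreement with the list above.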

Notice that in some cases there is more than one way to write n as a sum of two 
squares. Our first goal will be to describe a more efficient way to find the integer 
solutions of $x^2 + y^2 = n$ and to display them graphically in a way that helps illuminate
their structure. The technique for doing this will work not just for the function $x^2 + y^2$
but also for any function $Q(x, y) = ax^2 + bxy + cy^2$, where $a$, $b$, and $c$ are integer
constants. Such a function $Q(x, y)$ with at least one of the coefficients $a, b, c$ nonzero
is called a quadratic form, or sometimes just a form for short.

Solving $x^2 + y^2 = n$ amounts to representing $n$ as the sum of two squares.
More generally, solving $Q(x, y) = n$ is called representing $n$ by the form $Q(x, y)$.
So the overall goal is to solve the representation problem: Which numbers $n$ are
represented by a given form $Q(x, y)$, and how does one find such representations?
Since every quadratic form $Q(x, y)$ has $Q(0, 0) = 0$, the pair $(x, y) = (0, 0)$ is not
very interesting, so we will always assume implicitly that $(x, y) \ne (0, 0)$, as we did
for the list of solutions of $x^2 + y^2 = n$ above.

Before starting to describe the method for displaying the values of a quadratic 
form graphically, let us make a preliminary observation: If the greatest common di- 
visor of two integers x and y is d, then we can write x = dx’, y = dy’, and 
$Q(x, y) = d^2 Q(x', y')$ where the greatest common divisor of $x'$ and $y'$ is 1. Hence
it suffices to find the values of $Q$ on primitive pairs $(x, y)$, the pairs whose greatest
common divisor is 1, and then multiply these values by arbitrary squares $d^2$.

In a similar way, if the coefficients $a, b, c$ of a form $Q(x, y) = ax^2 + bxy + cy^2$
have greatest common divisor $d$, so $a = da'$, $b = db'$, and $c = dc'$ for integers
$a', b', c'$ whose greatest common divisor is 1, then $Q(x, y) = d(a'x^2 + b'xy + c'y^2) =
dQ'(x, y)$ for the form $Q'(x, y) = a'x^2 + b'xy + c'y^2$. Multiplying all the values of
a form by a constant $d$ is a fairly trivial operation, so for most purposes it suffices to
restrict attention to forms for which the greatest common divisor of the coefficients
is 1. Such forms are called primitive forms.

Primitive pairs $(x, y)$ correspond almost exactly to fractions $x/y$ in lowest terms,
the only ambiguity being that both $(x, y)$ and $(-x, -y)$ correspond to the same
fraction $x/y = \frac{-x}{-y}$. However, this ambiguity does not affect the value of a quadratic
form $Q(x, y) = ax^2 + bxy + cy^2$ since $Q(x, y) = Q(-x, -y)$. This means that we
can regard $Q(x, y)$ as being essentially a function $f(x/y)$. Notice that we are not
excluding the possibilities $(x, y) = (1, 0)$ and $(x, y) = (-1, 0)$ which correspond to
the “fractions” $1/0$ and $-1/0$. There will be no need to distinguish between $1/0$ and
$-1/0$ since $Q(1, 0) = Q(-1, 0)$.


4.1 The Topograph 


We already have a nice graphical representation of rational numbers $x/y$ along
with $\pm 1/0$ as the vertices in the Farey diagram. Here is a picture of the Farey diagram
with the so-called dual tree superimposed:

[Figure: the circular Farey diagram with the dual tree superimposed.]

The dual tree has a vertex in the center of each triangle of the Farey diagram, and it
has an edge crossing each edge of the Farey diagram. As with the Farey diagram, we 
can only draw a finite part of the dual tree. The actual dual tree has branching that 
repeats infinitely often with smaller and smaller branches. 

The tree divides the interior of the large circle into regions, each of which is 
adjacent to one vertex of the original diagram. We can write the value $Q(x, y)$ in the
region adjacent to the vertex $x/y$. This is shown in the figure below for the quadratic
form $Q(x, y) = x^2 + y^2$, where to unclutter the picture we no longer draw the triangles
of the original Farey diagram.

[Figure: the dual tree with the value of $x^2 + y^2$ written in the region adjacent to each vertex $x/y$.]

For example the 13 in the region adjacent to the fraction $2/3$ represents the value
$2^2 + 3^2$, and the 29 in the region adjacent to $5/2$ represents the value $5^2 + 2^2$.

For a quadratic form $Q$ this picture showing the values $Q(x, y)$ is called the
topograph of $Q$. It turns out that there is a very simple method for computing
the topograph from just a very small amount of initial data. This method is based
on the following arithmetic progression rule: If the values of
$Q(x, y)$ in the four regions surrounding an edge in the tree are
$p$, $q$, $r$, and $s$, with $q$ and $r$ in the two regions separated by the edge
and $p$ and $s$ in the regions at its two ends, then the
three numbers $p$, $q + r$, $s$ form an arithmetic progression.

We can check this in the topograph of $x^2 + y^2$ shown above. Consider for example
one of the edges separating the values 1 and 2. The values in the four regions
surrounding this edge are 1, 1, 2, 5 and the arithmetic progression is 1, 1 + 2, 5. For
an edge separating the values 1 and 5 the arithmetic progression is 2, 1 + 5, 10. For
an edge separating the values 5 and 13 the arithmetic progression is 2, 5 + 13, 34.
And similarly for all the other edges.

The arithmetic progression rule implies that the values of $Q$ in the three regions
surrounding a single vertex of the tree determine the values in all other regions, by
starting at the vertex where the three adjacent values are known and working one’s
way outward in the dual tree. The easiest place to start for a quadratic form $Q(x, y) =
ax^2 + bxy + cy^2$ is with the three values $Q(1, 0) = a$, $Q(0, 1) = c$, and $Q(1, 1) =
a + b + c$ for the three fractions $1/0$, $0/1$, and $1/1$. Here are two examples:


[Figure: the topographs of $Q(x, y) = x^2 + 2y^2$ and $Q(x, y) = x^2 - 2y^2$.]


In the first case we start with the values 1 and 2 together with the 3 just above them. 
These determine the value 9 above the 2 via the arithmetic progression 1, 2 +3, 9. 
Similarly the 6 above the 1 is determined by the arithmetic progression 2, 1 + 3, 
6. Next one can fill in the 19 next to the 9 we just computed, using the arithmetic 
progression 3, 2+ 9, 19, and so on for as long as one likes. 
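
The filling-in procedure can be sketched in Python as follows. This is a minimal illustration, not a full implementation: it only produces the regions labeled by positive fractions (the half of the topograph beyond the edge separating $1/0$ and $0/1$ on the side of $1/1$), it represents a region by the pair $(x, y)$, and the function name `topograph` and the parameter `depth` are hypothetical. Each new value is computed from the arithmetic progression rule in the form $s = 2(q + r) - p$, never from the formula for $Q$ itself.

```python
def topograph(a, b, c, depth):
    """Values of Q = a x^2 + b xy + c y^2 on primitive pairs (x, y) with x, y >= 0,
    filled in outward from the regions 1/0, 0/1, 1/1 for `depth` rounds."""
    values = {(1, 0): a, (0, 1): c, (1, 1): a + b + c}
    # Each frontier entry is an edge of the dual tree: the two regions u, v that
    # it separates, and the region w at the end of the edge we are moving away from.
    frontier = [((1, 0), (1, 1), (0, 1)), ((1, 1), (0, 1), (1, 0))]
    for _ in range(depth):
        next_frontier = []
        for u, v, w in frontier:
            new = (u[0] + v[0], u[1] + v[1])                         # mediant label
            values[new] = 2 * (values[u] + values[v]) - values[w]    # p, q + r, s
            next_frontier += [(u, new, v), (new, v, u)]
        frontier = next_frontier
    return values

vals = topograph(1, 0, 2, 2)        # the form x^2 + 2y^2
print(vals[(1, 2)], vals[(2, 1)], vals[(1, 3)])   # 9 6 19
```

The three printed values are the 9, 6, and 19 found by hand in the first example above.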

The procedure for the other form $x^2 - 2y^2$ is just the same, but here there are
negative as well as positive values. The edges that separate positive values from 
negative values will be important later, so we have indicated these edges by special 
shading. 

Perhaps the most noticeable thing in both the examples $x^2 + 2y^2$ and $x^2 - 2y^2$
is the fact that the values in the lower half of the topograph are the same as those in
the upper half. We could have predicted in advance that this would happen because
$Q(x, y) = Q(-x, y)$ whenever $Q(x, y) = ax^2 + cy^2$, with no $xy$ term. The topo-
graph for $x^2 + y^2$ has even more symmetry since the values of $x^2 + y^2$ are unchanged
when $x$ and $y$ are switched, so the topograph has left-right symmetry as well.


Given any three integers $a$, $b$, and $c$ which are not all zero, there is always a
quadratic form whose topograph has these three numbers surrounding a vertex since
the form $ax^2 + (c - a - b)xy + by^2$ takes the values $a$, $b$, and $c$ for $(x, y)$ equal to
$(1, 0)$, $(0, 1)$, and $(1, 1)$.

Now let us prove the arithmetic progression rule. Let the two vertices of the Farey
diagram corresponding to the values $q$ and $r$ have labels $x_1/y_1$ and $x_2/y_2$ as in the
following figure:

[Figure: an edge of the dual tree separating the regions $q$ and $r$, which are labeled $x_1/y_1$ and $x_2/y_2$, with the region $p$ at one end labeled $\frac{x_1 - x_2}{y_1 - y_2}$ and the region $s$ at the other end labeled $\frac{x_1 + x_2}{y_1 + y_2}$.]

Then by the mediant rule for labeling vertices, the labels on the $p$ and $s$ regions are
the fractions shown. Note that these labels are correct even when $x_1/y_1 = 1/0$ and
$x_2/y_2 = 0/1$. For a quadratic form $Q(x, y) = ax^2 + bxy + cy^2$ we then have:


$$\begin{aligned} s = Q(x_1 + x_2, y_1 + y_2) &= a(x_1 + x_2)^2 + b(x_1 + x_2)(y_1 + y_2) + c(y_1 + y_2)^2 \\ &= \underbrace{ax_1^2 + bx_1y_1 + cy_1^2}_{Q(x_1, y_1) = q} + \underbrace{ax_2^2 + bx_2y_2 + cy_2^2}_{Q(x_2, y_2) = r} + (\cdots) \end{aligned}$$

Similarly, we have:

$$p = Q(x_1 - x_2, y_1 - y_2) = \underbrace{ax_1^2 + bx_1y_1 + cy_1^2}_{Q(x_1, y_1) = q} + \underbrace{ax_2^2 + bx_2y_2 + cy_2^2}_{Q(x_2, y_2) = r} - (\cdots)$$

The omitted terms in $(\cdots)$ are the same in both cases, namely the terms involving
both subscripts 1 and 2. If we compute $p + s$ by adding the two formulas together,
the terms $(\cdots)$ will cancel, leaving just $p + s = (q + r) + (q + r)$. This equation can be
rewritten as $(q + r) - p = s - (q + r)$, which just says that $p, q + r, s$ is an arithmetic
progression. □
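
For readers who want to see the omitted terms, they can be written out by expanding the squares and products in the two formulas. The display below is a direct expansion using the same symbols as the proof.

```latex
\begin{aligned}
s &= q + r + \bigl(2a\,x_1x_2 + b\,(x_1y_2 + x_2y_1) + 2c\,y_1y_2\bigr),\\
p &= q + r - \bigl(2a\,x_1x_2 + b\,(x_1y_2 + x_2y_1) + 2c\,y_1y_2\bigr),\\
p + s &= 2(q + r).
\end{aligned}
```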


Exercises 


1. Draw the topograph for the form $Q(x, y) = 2x^2 + 5y^2$, showing all the values of
$Q(x, y) < 60$ in the topograph, with the associated fractional labels $x/y$. If there
is symmetry in the topograph, you only need to draw one half of the topograph and
state that the other half is symmetric.


2. Do the same for the form $Q(x, y) = 2x^2 + xy + 2y^2$, in this case displaying all
values Q(x, y) < 40 in the topograph. 


3. Do the same for the form $Q(x, y) = x^2 - y^2$, showing all the values between $+30$
and $-30$ in the topograph, but omitting the labels $x/y$ this time.


4. For the form $Q(x, y) = 2x^2 - xy + 3y^2$ do the following:

(a) Draw the topograph, showing all the values $Q(x, y) < 30$ in the topograph, and
including the labels $x/y$.

(b) List all the values Q(x, y) < 30 in order, including the values when the pair (x, y) 
is not primitive. 

(c) Find all the integer solutions of Q(x, y) = 24, both primitive and nonprimitive. 
(And do not forget that quadratic forms always satisfy Q(x, y) = Q(-x,-y).) 


5. Find the quadratic form Q(x, y) for which Q(3,5) = Q(4,7) = Q(7,12) = 1 by 
first drawing a strip in the Farey diagram containing the triangles $\langle 1/0, 0/1, 1/1 \rangle$ and
$\langle 3/5, 4/7, 7/12 \rangle$ (this can be done using the continued fraction for $7/12$), then adding the
edges of the dual tree that meet these triangles, then filling in values of the topograph 
starting with the given values. 


4.2 Periodicity 


For most quadratic forms that take on both positive and negative values, such as
$x^2 - 2y^2$, there is another way of drawing the topograph that reveals some hidden
and unexpected properties. Looking back at the topograph we drew for $x^2 - 2y^2$
we see a zigzag path of edges separating the positive and negative values, and if we
straighten this path out to be a line, called the separator line, what we see is the
following infinitely repeated pattern:

[Figure: the straightened-out topograph of $Q(x, y) = x^2 - 2y^2$. The values 1 and 2 alternate along the top of the separator line and the values $-1$ and $-2$ alternate along the bottom, with 7, 14, ... farther up and $-7$, $-14$, $-17$, ... farther down.]


To construct this, one can first build the separator line starting with the three values 
Q(1,0) = 1, Q(0,1) = —2, and Q(1,1) = —1. Place these as shown in part (a) of the 
figure below, with a horizontal line segment separating the positive from the negative 
values.
To extend the separator line one step farther to the right, apply the arithmetic progres-
sion rule to compute the next value 2 using the arithmetic progression $-2, 1 - 1, 2$.
Since this value 2 is positive, we place it above the horizontal line and insert a vertical
edge to separate this 2 from the 1 to the left of it, as in (b) of the figure. Now we
repeat the process with the next arithmetic progression $1, 2 - 1, 1$ and put the new 1
above the horizontal line with a vertical edge separating it from the preceding 2, as
shown in (c). At the next step we compute the next value $-2$ and place it below the
horizontal line since it is negative, giving (d). One more step produces (e) where we see 
that further repetitions will produce a pattern that repeats periodically as we move to 
the right. The arithmetic progression rule also implies that it repeats periodically to 
the left, so it is periodic in both directions: 


Thus we have the periodic separator line. To get the rest of the topograph we can then 
work our way upward and downward from the separator line, as shown in the original 
figure. As one moves upward from the separator line, the values of Q become larger 
and larger, approaching $+\infty$ monotonically, and as one moves downward, the values
approach $-\infty$ monotonically. The reason for this will become clear in Section 5.1
when we discuss something called the Monotonicity Property. 

An interesting property of this form $x^2 - 2y^2$ that is evident from its topograph
is that it takes on the same negative values as positive values. This would have been
hard to predict from the formula $x^2 - 2y^2$. Indeed, for the similar-looking quadratic
form $x^2 - 3y^2$ the negative values are quite different from the positive values, as one
can see in its straightened-out topograph:

[Figure: the straightened-out topograph of $Q(x, y) = x^2 - 3y^2$. The value 1 repeats along the top of the separator line and the values $-2$ and $-3$ alternate along the bottom, with 6, 13, ... farther up and $-11$, $-23$, $-26$, ... farther down.]


There is a close connection between the separator line in the topograph of a
quadratic form $x^2 - dy^2$ and the infinite continued fraction for $\sqrt{d}$ when $d$ is a
positive integer that is not a square. In fact, we will see that the topograph can be
used to compute the continued fraction for $\sqrt{d}$. As an example let us look at the case
$d = 2$. The relevant portion of the topograph for $x^2 - 2y^2$ is the strip along the line
separating the positive and negative values:


This is a part of the dual tree of the Farey diagram. If we superimpose the triangles 
of the Farey diagram corresponding to this part of the dual tree, we obtain an infinite 
strip of triangles: 


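[Figure: the infinite strip of triangles, with vertices $1/0$, $2/1$, $3/2$, $10/7$, $17/12$, $58/41$, $99/70$, ... along the top and $0/1$, $1/1$, $4/3$, $7/5$, $24/17$, $41/29$, ... along the bottom.]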


Ignoring the dotted triangles to the left, the infinite strip of triangles corresponds to
the infinite continued fraction $1 + \overline{{}^1\!/_2}$. We saw how to compute the value of this
continued fraction in Chapter 2, but there is an easier way using the quadratic form
$x^2 - 2y^2$. For fractions $x/y$ labeling the vertices along the infinite strip, the corre-
sponding values $n = x^2 - 2y^2$ are either $\pm 1$ or $\pm 2$. We can rewrite the equation
$x^2 - 2y^2 = n$ as $(x/y)^2 = 2 + n/y^2$. As we go farther and farther to the right in the
infinite strip, both $x$ and $y$ are getting larger and larger while $n$ only varies through
finitely many values, namely $\pm 1$ and $\pm 2$, so the quantity $n/y^2$ is approaching 0. The
equation $(x/y)^2 = 2 + n/y^2$ then implies that $(x/y)^2$ is approaching 2, so $x/y$ is
approaching $\sqrt{2}$. Since these fractions $x/y$ are the convergents for the infinite con-
tinued fraction $1 + \overline{{}^1\!/_2}$ that corresponds to the infinite strip, this implies that the
value of the continued fraction $1 + \overline{{}^1\!/_2}$ is $\sqrt{2}$.
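
The convergence argument can be watched numerically. The Python sketch below generates the first few convergents $x/y$ of $1 + \overline{{}^1\!/_2}$ from the standard recurrence for a continued fraction whose partial quotients after the first are all 2, and checks with exact rational arithmetic that $(x/y)^2 - 2$ equals $n/y^2$ with $n = x^2 - 2y^2$. The function name is hypothetical, and only the convergents are listed, not the intermediate vertices of the strip where $n = \pm 2$.

```python
from fractions import Fraction

def sqrt2_convergents(count):
    """The first `count` convergents of 1 + 1/(2 + 1/(2 + ...)) as pairs (x, y)."""
    pairs = [(1, 1), (3, 2)]
    while len(pairs) < count:
        (x0, y0), (x1, y1) = pairs[-2], pairs[-1]
        pairs.append((2 * x1 + x0, 2 * y1 + y0))   # each partial quotient is 2
    return pairs[:count]

for x, y in sqrt2_convergents(6):
    n = x * x - 2 * y * y                  # +1 or -1 for these convergents
    error = Fraction(x, y) ** 2 - 2        # exactly n / y^2
    assert error == Fraction(n, y * y)
    print(f"{x}/{y}: n = {n:+d}, (x/y)^2 - 2 = {error}")
```

Since $n$ stays bounded while $y$ grows, the printed errors shrink rapidly toward 0.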


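As another example, the quadratic form $x^2 - 3y^2$ can be used to compute the
continued fraction $\sqrt{3} = 1 + \overline{{}^1\!/_1 + {}^1\!/_2}$ by the same reasoning:

[Figure: the infinite strip of triangles for $x^2 - 3y^2$, with vertices $1/0$, $2/1$, $7/4$, $26/15$, ... along the top and $0/1$, $1/1$, $3/2$, $5/3$, $12/7$, $19/11$, ... along the bottom.]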


One can see in these two examples that it is not really necessary to draw the 
strip of triangles, and one can just read off the continued fraction directly from the 
periodic separator line. Let us illustrate this by considering the separator line for the
form $x^2 - 10y^2$ shown below:

[Figure: the separator line for $x^2 - 10y^2$.]

If one moves toward the right along the separator line starting at a point in the edge
separating the $1/0$ region from the $0/1$ region, one first encounters three edges leading
off to the right (downward), then six edges leading off to the left (upward), then six
edges leading off to the right, and thereafter six edges leading off to the left and right
alternately. This means that the continued fraction for $\sqrt{10}$ is $3 + \overline{{}^1\!/_6}$.
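
Reading the continued fraction off the separator line amounts to counting how many consecutive new regions fall on the same side of the line. Here is a minimal Python sketch of that count for the form $x^2 - dy^2$, assuming $d$ is a positive integer that is not a square. For simplicity it finds the sign of each new region by evaluating the form at the mediant label rather than by the arithmetic progression rule, and the function name is hypothetical.

```python
def sqrt_cf_from_form(d, terms):
    """First `terms` partial quotients of sqrt(d), read from the signs of
    x^2 - d*y^2 at the regions met along the separator line."""
    pos, neg = (1, 0), (0, 1)            # regions with values +1 and -d
    quotients = []
    run, run_is_negative = 0, True       # the first side roads are on the negative side
    while len(quotients) < terms:
        x, y = pos[0] + neg[0], pos[1] + neg[1]   # next region, by the mediant rule
        is_negative = x * x - d * y * y < 0
        if is_negative != run_is_negative:        # the side roads switch sides
            quotients.append(run)
            run, run_is_negative = 0, is_negative
        run += 1
        if is_negative:
            neg = (x, y)
        else:
            pos = (x, y)
    return quotients

print(sqrt_cf_from_form(2, 4))     # [1, 2, 2, 2]
print(sqrt_cf_from_form(10, 3))    # [3, 6, 6]
```

For $d = 10$ the output reproduces the three edges to the right followed by runs of six described above.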


Here is a more complicated example showing how to compute the continued fraction
for $\sqrt{19}$ from the form $x^2 - 19y^2$:


From this we read off that $\sqrt{19} = 4 + \overline{{}^1\!/_2 + {}^1\!/_1 + {}^1\!/_3 + {}^1\!/_1 + {}^1\!/_2 + {}^1\!/_8}$.


In Section 5.1 we will prove that the topographs of forms $x^2 - dy^2$ always have
a periodic separator line when $d$ is a positive integer that is not a square. As in
the examples above, this separator line always includes the edge of the topograph
separating the $1/0$ and $0/1$ regions since the form takes the positive value $+1$ at $1/0$
and the negative value $-d$ at $0/1$. In addition to being periodic, the separator line also
has mirror symmetry with respect to reflection across the vertical line through the $1/0$
and $0/1$ regions. This is because the form $x^2 - dy^2$ has no $xy$ term, so replacing
$x/y$ by $-x/y$ does not change the value of the form. Replacing $x/y$ by $-x/y$ reflects
the circular Farey diagram across the horizontal edge from $1/0$ to $0/1$ and this reflects
the periodic separator line across the vertical line through the $1/0$ and $0/1$ regions.
Once the separator line has symmetry with respect to this vertical line, the periodicity
forces it to have mirror symmetry with respect to an infinite sequence of vertical lines,
the dotted lines in the figure below for the form $x^2 - 19y^2$:

[Figure: the separator line for $x^2 - 19y^2$ with dotted vertical lines of mirror symmetry, marking $a_0$, the palindrome, and $a_k = 2a_0$.]

These mirror symmetries imply that the continued fraction for $\sqrt{d}$ has the form

$$\sqrt{d} = a_0 + \overline{{}^1\!/_{a_1} + {}^1\!/_{a_2} + \cdots + {}^1\!/_{a_k}}$$

with two further special properties:
• $a_k = 2a_0$.
• The intermediate terms $a_1, a_2, \cdots, a_{k-1}$ form a palindrome, reading the same
forward as backward.
Thus in $\sqrt{19} = 4 + \overline{{}^1\!/_2 + {}^1\!/_1 + {}^1\!/_3 + {}^1\!/_1 + {}^1\!/_2 + {}^1\!/_8}$ the final 8 is twice the initial 4,
and the intermediate terms 2, 1, 3, 1, 2 form a palindrome. These special properties
held also in the earlier examples, but were less apparent because there were fewer
terms in the repeated part of the continued fraction.
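
The two special properties are easy to test experimentally. The Python sketch below does not use the topograph. Instead it uses the standard integer recurrence for the continued fraction of $\sqrt{d}$ and stops at the first term equal to $2a_0$, relying on the well-known fact that this is where the period ends. The function name is hypothetical.

```python
from math import isqrt

def sqrt_cf(d):
    """Return (a_0, [a_1, ..., a_k]) where [a_1, ..., a_k] is the repeating
    block of the continued fraction of sqrt(d), for d not a perfect square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a, period = 0, 1, a0, []
    while a != 2 * a0:                 # the period ends at the term 2*a_0
        m = q * a - m
        q = (d - m * m) // q           # this division is always exact
        a = (a0 + m) // q
        period.append(a)
    return a0, period

print(sqrt_cf(19))    # (4, [2, 1, 3, 1, 2, 8])
print(sqrt_cf(13))    # (3, [1, 1, 1, 1, 6])

# Check the palindrome property for all non-square d below 200.
for d in range(2, 200):
    if isqrt(d) ** 2 != d:
        a0, period = sqrt_cf(d)
        assert period[:-1] == period[:-1][::-1]
```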
In some cases there is an additional kind of symmetry along the separator line, 
as illustrated for the form $x^2 - 13y^2$:

As before there is a horizontal translation giving the periodicity and there are mirror
symmetries across vertical lines, but now there is an extra glide reflection along the 
strip that interchanges the positive and negative values of the form. Performing this 
glide reflection twice in succession gives the translational periodicity. Notice that 
there are also 180 degree rotational symmetries about the points marked with dots 
on the separator line, and these rotations account for the palindromic middle part of 
the continued fraction:

$$\sqrt{13} = 3 + \overline{{}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_6}$$


The fact that the periodic part has odd length corresponds to the separator line having 
the glide reflection symmetry. We could rewrite the continued fraction to have a 
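periodic part of even length by doubling the period:

$$\sqrt{13} = 3 + \overline{{}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_6 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_6}$$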


This corresponds to ignoring the glide reflection and just considering the translational 
periodicity. 


We have been using quadratic forms $x^2 - dy^2$ to compute the continued fractions
for irrational numbers $\sqrt{d}$, but everything works just the same for irrational numbers
$\sqrt{p/q}$ using the quadratic form $qx^2 - py^2$ in place of $x^2 - dy^2$. Following the same
reasoning as before, if the equation $qx^2 - py^2 = n$ is rewritten as $q(x/y)^2 = p + n/y^2$
then we see that as we move out along the periodic separator line the numbers $x$ and
$y$ approach infinity while $n$ cycles through finitely many values, so the term $n/y^2$
approaches 0 and the fractions $x/y$ approach a number $z$ satisfying $qz^2 = p$, so
$z = \sqrt{p/q}$. This argument depends of course on the existence of a periodic separator
line, and we will prove in the next chapter that forms $qx^2 - py^2$ always have a periodic
separator line if $p$ and $q$ are positive and the roots $\pm\sqrt{p/q}$ of $qz^2 - p = 0$ are
irrational.


Here are some examples. For the first one we use the form $3x^2 - 7y^2$ to compute
the continued fraction for $\sqrt{7/3}$:


This gives $\sqrt{7/3} = 1 + \overline{{}^1\!/_1 + {}^1\!/_1 + {}^1\!/_8 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_2}$. For comparison, we can
compute the continued fraction for $\sqrt{3/7}$ from the topograph of $7x^2 - 3y^2$:


The separator line here is obtained from the previous one by reflecting across a hor-
izontal axis and changing the sign of the labels. These modifications correspond to
changing $3x^2 - 7y^2$ to $3y^2 - 7x^2$ by first interchanging $x$ and $y$ which reflects the
Farey diagram and hence also the topograph, and then changing the sign of the re-
sulting form $3y^2 - 7x^2$ to get $7x^2 - 3y^2$. From the separator line for $7x^2 - 3y^2$ we
then read off the continued fraction ${}^1\!/_1 + \overline{{}^1\!/_1 + {}^1\!/_1 + {}^1\!/_8 + {}^1\!/_1 + {}^1\!/_1 + {}^1\!/_2}$ for $\sqrt{3/7}$.
This is the reciprocal of the previous continued fraction since $\sqrt{3/7}$ is the reciprocal
of $\sqrt{7/3}$.


For the next example we use $10x^2 - 29y^2$ to compute the continued fraction for
$\sqrt{29/10}$ from the separator line:


This gives $\sqrt{29/10} = 1 + \overline{{}^1\!/_1 + {}^1\!/_2 + {}^1\!/_2 + {}^1\!/_1 + {}^1\!/_2}$. The period of odd length here
corresponds to the existence of the glide reflection and 180 degree rotation symme-
tries.

As we see in these examples there are two cases:

$$\sqrt{p/q} = \begin{cases} a_0 + \overline{{}^1\!/_{a_1} + {}^1\!/_{a_2} + \cdots + {}^1\!/_{a_k}} & \text{if } p > q \\ {}^1\!/_{a_0} + \overline{{}^1\!/_{a_1} + {}^1\!/_{a_2} + \cdots + {}^1\!/_{a_k}} & \text{if } p < q \end{cases}$$

The palindrome property and the relation $a_k = 2a_0$ that we observed in the continued
fraction for $\sqrt{d}$ still hold for irrational numbers $\sqrt{p/q}$. The key point is that the form
$qx^2 - py^2$ is unchanged when the sign of $x$ is changed, so its topograph has mirror
symmetry with respect to reflection across a line through the $1/0$ and $0/1$ regions, and
this symmetry implies the special properties of the continued fraction.

One might ask whether the irrational numbers $\sqrt{p/q}$ are the only numbers having
a continued fraction $a_0 + \overline{{}^1\!/_{a_1} + \cdots + {}^1\!/_{a_k}}$ or ${}^1\!/_{a_0} + \overline{{}^1\!/_{a_1} + \cdots + {}^1\!/_{a_k}}$ satisfying
the palindrome property and the relation $a_k = 2a_0$. Here we should restrict atten-
tion only to positive irrational numbers since the numbers $a_0, a_1, \cdots, a_k$ must all be
positive. The answer is Yes, as we will see later in this section.


More generally, quadratic forms can be used to compute the continued fractions 
for all quadratic irrationals. To illustrate the general method let us find the continued 
fraction for $(10 + \sqrt{2})/14$ which is a root of the equation $14z^2 - 20z + 7 = 0$. The
associated quadratic form is $14x^2 - 20xy + 7y^2$, obtained by setting $z = x/y$ and then
multiplying by $y^2$. We would like to find a periodic separator line in the topograph
of this form. To do this we start with the three values at $1/0$, $0/1$, and $1/1$, which are
the positive numbers 14, 7, and 1, and we then use the arithmetic progression rule 
to move in a direction that leads to negative values since the separator line separates 
positive and negative values of the form. In this way we are led to a separator line 
which is indeed periodic: 


This figure lies in the upper half of the circular Farey diagram where the fractions 
$x/y$ labeling the regions in the topograph are positive. If we follow the separator line 
out to either end, the labels $x/y$ have both $x$ and $y$ increasing monotonically and 
approaching infinity, as a consequence of the mediant rule for labeling vertices of the 
Farey diagram. Hence the values 

$$14z^2 - 20z + 7 = 14(x/y)^2 - 20(x/y) + 7 = (14x^2 - 20xy + 7y^2)/y^2$$

are approaching zero since the values of the numerator $14x^2 - 20xy + 7y^2$ on the 
right just cycle through a finite set of numbers repeatedly, the values of the form along 
the separator line, while the denominators $y^2$ approach infinity. Thus the labels $x/y$ 
are approaching the roots of the equation $14z^2 - 20z + 7 = 0$. Since we are in the 
upper half of the Farey diagram, the smaller of the two roots, which is $(10 - \sqrt{2})/14$, 
is the limit toward the right along the separator line and the larger root $(10 + \sqrt{2})/14$ 
is the limit toward the left. 

To get the continued fraction for the smaller root, we follow the path in the dual 
tree of the topograph that starts with the edge between $1/0$ and $0/1$, then zigzags up 
to the separator line, then goes out this line to the right. If we straighten this path 
out it looks like the following: 

[Figure: the straightened-out path for the smaller root, starting at the edge between $1/0$ and $0/1$]

The continued fraction is therefore: 

$$\frac{10 - \sqrt{2}}{14} = 1/1 + 1/1 + 1/1 + 1/1 + \overline{1/2}$$

It is not actually necessary to redraw the straightened-out path since in the original 
form of the topograph we can read off the sequence of left and right “side roads” as 
we go along the path, the sequence LRLRLLRR where L denotes a side road to the 
left and R a side road to the right. This sequence determines the continued fraction. 
For the other root $(10 + \sqrt{2})/14$ the straightened-out path has the following shape: 


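[Figure: the straightened-out path for the larger root, starting at the edge between $1/0$ and $0/1$]

The sequence of side roads is LRRRRLLRR so the continued fraction is 

$$\frac{10 + \sqrt{2}}{14} = 1/1 + 1/4 + \overline{1/2}$$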
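
The two continued fractions for the roots $(10 \pm \sqrt{2})/14$ can be cross-checked without drawing the topograph. The sketch below is an illustration only: it uses the classical integer recurrence for numbers written as $(P + \sqrt{D})/Q$ with $Q$ dividing $D - P^2$, rather than the side-road method of the text, and the function name `cf_quadratic_irrational` is a hypothetical name chosen here.

```python
from math import isqrt


def cf_quadratic_irrational(P, Q, D):
    """Continued fraction of (P + sqrt(D)) / Q as (preperiod, period).

    Assumes D > 0 is not a square, Q != 0, and Q divides D - P*P.
    All arithmetic is exact integer arithmetic.
    """
    r = isqrt(D)
    if r * r == D:
        raise ValueError("D must not be a perfect square")
    if Q == 0 or (D - P * P) % Q != 0:
        raise ValueError("Q must be nonzero and divide D - P*P")
    terms, seen = [], {}
    while (P, Q) not in seen:
        seen[(P, Q)] = len(terms)
        if Q > 0:
            a = (P + r) // Q              # floor of (P + sqrt(D)) / Q
        else:
            a = (-P - r - 1) // (-Q)      # same floor, rewritten with a positive denominator
        terms.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q              # exact division
    start = seen[(P, Q)]
    return terms[:start], terms[start:]
```

The larger root is entered as `cf_quadratic_irrational(10, 14, 2)` and the smaller root, written as $(-10 + \sqrt{2})/(-14)$, as `cf_quadratic_irrational(-10, -14, 2)`. The leading 0 in each result records that both roots lie between 0 and 1, and the remaining terms match the block lengths in the side-road sequences LRRRRLLRR and LRLRLLRR.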


We will show that this procedure works for all quadratic irrational numbers, and 
this will prove the harder half of Lagrange’s Theorem: 


Proposition 4.1. The continued fraction for every quadratic irrational is eventually 
periodic. 


The proof will involve associating a quadratic form to each quadratic irrational, 
and we will need to use the fact that the quadratic forms arising in this way all have 
periodic separator lines. This will be proved in the next chapter, so the proof will not 
be officially complete until then. 

Before beginning the proof let us examine more closely the structure of all infinite 
strips in the Farey diagram, whether periodic or not. By an infinite strip in the Farey 
diagram we mean a collection of fans each consisting of a finite number of triangles, 
each fan intersecting the next along an edge of a zigzag path extending infinitely far 
in both directions: 

[Figure: an infinite strip of triangles formed by a sequence of fans along a zigzag path]


To see how the strip lies in the upper halfplane model of the Farey diagram let $L$ be a 
line running down the middle of the strip from end to end. Viewing $L$ as a path in the 
upper halfplane model of the Farey diagram, $L$ cannot cross only vertical edges, the 
edges with one end at $1/0$, otherwise the strip would consist of a single infinite fan, 
which is not allowed as an infinite strip. Thus $L$ must cross some semicircular edges. 
As we move along $L$ crossing such a semicircular edge in the downward direction into 
the adjacent triangle, the next edge that $L$ crosses will be one of the other two shorter 
semicircular edges of this triangle, moving downward again. All subsequent crossings 
will then be downward as well. The semicircles crossed are becoming smaller and 
smaller with diameters approaching zero, as we saw in our initial discussion of infinite 
continued fractions, and there is a unique limiting point $\alpha$ on the $x$-axis for this end 
of the strip of triangles. This is the unique point that lies between the two endpoints 
of each semicircular edge crossed by $L$ on its downward path. 

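Consider the vertical line $V_\alpha$ going upward from $\alpha$. Near its lower end $V_\alpha$ will 
pass through triangles of the strip. If the whole line $V_\alpha$ does not stay entirely within 
the strip as we move upward, it will eventually leave the strip by crossing the upper 
semicircular edge of a triangle $T$ of the strip as in the figure on the left below. 

[Figure: the two cases for an infinite strip in the upper halfplane model. Left: $V_\alpha$ leaves the strip through the upper edge of a triangle $T$. Right: $V_\alpha$ stays in the strip and reaches a triangle $T_\alpha$ with a vertex at $1/0$.]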


In this case the line $L$, which passes through the same upward sequence of triangles 
as $V_\alpha$ until reaching $T$, must exit $T$ by turning and crossing the other smaller 
semicircular edge of $T$ in the downward direction. After crossing this edge, $L$ will then 
continue downward forever, passing through all the triangles of the other end of the 
strip and limiting on an irrational number $\beta$. The vertical line $V_\beta$ going upward from 
$\beta$ will pass through the same set of triangles until reaching the triangle $T$ where it 
will also exit the strip by crossing the upper edge of $T$. We can then deform $L$ so 
that it consists of the parts of $V_\alpha$ and $V_\beta$ below $T$ joined by a bending arc within $T$. 
Notice that the vertex $1/0$ is not a vertex of the strip in this case. 




The other possibility is that $V_\alpha$ stays in the strip forever as we move upward, so 
eventually it lies in a triangle $T_\alpha$ of the strip having $1/0$ as a vertex as in the figure on 
the right above. One end of the line $L$ runs parallel to $V_\alpha$ until it reaches $T_\alpha$, then 
it turns right or left to cross a finite number of other triangles having $1/0$ as a vertex 
before turning downward to cross the lower edge of one of these triangles $T_\beta$. After 
this it will travel monotonically downward, limiting on an irrational number $\beta$ in the 
$x$-axis. We can deform $L$ to consist of parts of $V_\alpha$ and the vertical line $V_\beta$ through 
$\beta$, joined by an arc crossing from $T_\alpha$ to $T_\beta$. 


One conclusion we can draw from this analysis of the infinite strip is that its 
endpoints $\alpha$ and $\beta$ cannot be the same number. This can be seen from the two figures 
above where in the first figure $\alpha$ and $\beta$ lie below the two different lower edges of the 
triangle $T$, and in the second figure $\alpha$ and $\beta$ lie below the two different triangles $T_\alpha$ 
and $T_\beta$ with a vertex at $1/0$. 

Another consequence is that the labels $x/y$ on the vertices along the infinite strip 
must have denominators $y$ approaching infinity at the ends of the strip and numerators 
$x$ approaching either $+\infty$ or $-\infty$ depending on the sign of the endpoint $\alpha$ or 
$\beta$ being approached. This is because the labels are given by repeated applications of 
the mediant rule as we move vertically down either end of $L$ toward $\alpha$ or $\beta$ so $|x|$ 
and $y$ always increase as each new triangle is added to the strip. (Near the ends of 
the strip the labels $x/y$ are approaching $\alpha$ or $\beta$ so neither $x$ nor $y$ is 0.) 

We can also deduce that for each pair of distinct irrationals $\alpha$ and $\beta$ there is a 
unique infinite strip in the Farey diagram whose ends converge to $\alpha$ and $\beta$. This is 
because $\alpha$ and $\beta$ determine the vertical lines $V_\alpha$ and $V_\beta$ in the figures, and these 
determine the triangles $T$ or $T_\alpha$ and $T_\beta$ since in the case that $\alpha$ and $\beta$ lie in the same 
interval in the $x$-axis between consecutive integers, $T$ is the smallest triangle of the 
Farey diagram whose projection to the $x$-axis contains both $\alpha$ and $\beta$, while in the 
case that $\alpha$ and $\beta$ lie in different intervals between consecutive integers, the triangles 
$T_\alpha$ and $T_\beta$ are the triangles with vertex $1/0$ that project to these two intervals. 


A nice way to construct an infinite strip joining any two irrationals $\alpha$ and $\beta$ is 
to take all the triangles in the Farey diagram that meet the semicircle in the upper 
halfplane with endpoints $\alpha$ and $\beta$. This semicircle can cross an edge of the Farey 
diagram only once since if two semicircles in the upper halfplane with endpoints on 
the $x$-axis intersect in more than one point, then they must coincide. Nor can two 
semicircles with endpoints on the $x$-axis be tangent unless the point of tangency is 
one of the endpoints, but this does not happen here since $\alpha$ and $\beta$ are irrational while 
the endpoints of edges of the Farey diagram are rational. From these observations we 
see that if the semicircle from $\alpha$ to $\beta$ intersects a triangle of the Farey diagram, then 
it crosses this triangle from one edge to another edge. The semicircle cannot cross an 
infinite number of triangles having a common vertex, otherwise the semicircle would 
contain points arbitrarily close to the common vertex, which is impossible since the 
common vertex cannot be either of the irrational numbers $\alpha$ and $\beta$. Thus the union 
of all the triangles crossed by the semicircle is an infinite strip. 

We have seen that an infinite strip is uniquely determined by its endpoints, so 
this implies that the semicircle from $\alpha$ to $\beta$ crosses exactly the same triangles as the 
line we constructed earlier consisting of two vertical segments joined at the top by 
a 180 degree bend. This may seem odd at first glance, but what it means is that the 
height of the vertical segments cannot be too large compared to the distance between 
them. 

The construction of a strip connecting two irrational numbers $\alpha$ and $\beta$ via the 
semicircle with endpoints $\alpha$ and $\beta$ works equally well when $\alpha$ or $\beta$ is rational, but 
in this case the strip has only a finite number of triangles at a rational end. A very 
special case is when $\alpha$ and $\beta$ are the endpoints of an edge of the Farey diagram, when 
the strip degenerates to just this edge. 


Proof of Proposition 4.1: Quadratic irrationals are the numbers $\alpha = A + B\sqrt{n}$ for 
which $A$ and $B$ are rational, $B$ is nonzero, and $n$ is a positive integer that is not a 
square. The first step in the proof will be to find a quadratic equation with integer 
coefficients having $\alpha$ as a root. From the quadratic formula we know the other root will 
have to be the conjugate $\bar\alpha = A - B\sqrt{n}$, with $\bar\alpha \ne \alpha$ since $B \ne 0$. A quadratic equation 
having $\alpha$ and $\bar\alpha$ as roots is then $(z - \alpha)(z - \bar\alpha) = 0$. Multiplied out, this becomes 
$z^2 - (\alpha + \bar\alpha)z + \alpha\bar\alpha = z^2 - 2Az + (A^2 - B^2 n) = 0$ which has rational coefficients since $A$ 
and $B$ are rational. After multiplying by a common denominator for the coefficients, 
this becomes an equation $az^2 + bz + c = 0$ with integer coefficients having $\alpha$ and $\bar\alpha$ 
as roots. Here $a > 0$ since it is the common denominator we multiplied by. 
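
As a concrete illustration of this first step, the following sketch clears denominators in $z^2 - 2Az + (A^2 - B^2 n)$ using exact rational arithmetic. It is only an illustration under the assumption that the least common denominator is used as the multiplier; the function name `integer_quadratic` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def integer_quadratic(A, B, n):
    """Integer coefficients (a, b, c) of a quadratic with root A + B*sqrt(n).

    A and B are rationals (int or Fraction), n is a positive non-square integer.
    Starts from z^2 - 2Az + (A^2 - B^2 n) and multiplies by the least
    common denominator of the coefficients, so a > 0.
    """
    A, B = Fraction(A), Fraction(B)
    coeffs = [Fraction(1), -2 * A, A * A - B * B * n]
    denom = 1
    for coeff in coeffs:
        d = coeff.denominator
        denom = denom * d // gcd(denom, d)   # running least common multiple
    a, b, c = (int(coeff * denom) for coeff in coeffs)
    return a, b, c
```

For the earlier example $(10 + \sqrt{2})/14$ we have $A = 5/7$, $B = 1/14$, $n = 2$, and `integer_quadratic(Fraction(5, 7), Fraction(1, 14), 2)` returns `(14, -20, 7)`, the coefficients of $14z^2 - 20z + 7$.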

The polynomial $az^2 + bz + c$ determines a quadratic form $ax^2 + bxy + cy^2$. This 
form has two special properties: 

• Its topograph contains both positive and negative values. This is because the 
polynomial $az^2 + bz + c = a(z - \alpha)(z - \bar\alpha)$ takes negative values when $z$ is between 
the two roots $\alpha$ and $\bar\alpha$, where the two factors in parentheses have opposite sign, 
and positive values when $z$ is greater than both roots or less than both roots, so 
the two parenthetical factors have the same sign. Thus there are rational numbers 
$z = x/y$ where the left side of the equation 

$$a(x/y)^2 + b(x/y) + c = (ax^2 + bxy + cy^2)/y^2$$

has both signs, hence the same is true for the numerator on the right. 

• The topograph does not contain the value 0. To see why, suppose there is a pair 
$(x, y) \ne (0, 0)$ with $ax^2 + bxy + cy^2 = 0$. We cannot have $y = 0$, otherwise $x$ 
would also be 0 since $a \ne 0$. Then since $y \ne 0$, the displayed equation above 
would say that $x/y$ was a rational root of $az^2 + bz + c = 0$, contradicting the fact 
that its roots $\alpha$ and $\bar\alpha$ are irrational. 

We will show in Theorem 5.2 that every form $ax^2 + bxy + cy^2$ satisfying these two 
properties has a periodic separator line in its topograph. This corresponds to an 
infinite periodic strip in the Farey diagram. We claim that the ends of this strip must 
be at the roots $\alpha$ and $\bar\alpha$ of the equation $az^2 + bz + c = 0$. To see why this is true, 
consider the labels $x/y$ on the vertices along this strip. Since the denominators $y$ 
approach infinity as we go out to either end of the periodic infinite strip while the 
values of the form $ax^2 + bxy + cy^2$ cycle through finitely many values, it follows 
that the values of the right side of the equation 

$$a(x/y)^2 + b(x/y) + c = (ax^2 + bxy + cy^2)/y^2$$

are approaching zero. This means that the vertex labels $x/y$ are approaching a root 
of the equation $az^2 + bz + c = 0$. We saw earlier that the two ends of the strip are at 
two different irrational numbers, so the two ends of the strip are at the two roots $\alpha$ 
and $\bar\alpha$, as claimed. 

To get the continued fraction for the given root $\alpha$ we take the strip consisting 
of all the triangles in the upper halfplane Farey diagram that meet the vertical line 
through $\alpha$. This strip will start at the vertex $1/0$ at the top and then move downward 
through an infinite sequence of triangles that eventually coincide with the triangles 
in one end of the infinite periodic strip. This means that the continued fraction for $\alpha$ 
is eventually periodic. □ 


Now we can answer a question raised earlier in this section: 


Proposition 4.2. The numbers $\sqrt{p/q}$ are the only quadratic irrationals having continued 
fractions $a_0 + \overline{1/a_1 + \cdots + 1/a_n}$ or $1/a_0 + \overline{1/a_1 + \cdots + 1/a_n}$ satisfying the 
palindrome property and the relation $a_n = 2a_0$. 


Proof: Suppose the continued fraction for a quadratic irrational $\alpha$ satisfies these 
conditions. In particular $\alpha$ must be positive since $a_0$ is positive, being half the positive 
number $a_n$. The strip in the Farey diagram corresponding to this continued fraction 
starts at the $(1/0, 0/1)$ edge and goes out to $\alpha$ at its end. Combining this strip with its 
reflection across the $(1/0, 0/1)$ edge gives an infinite strip with mirror symmetry across 
the $(1/0, 0/1)$ edge, and this strip is periodic everywhere, even at the junction along 
the $(1/0, 0/1)$ edge since $a_n = 2a_0$. The other end of this strip is $\bar\alpha$ since we have seen 
that the two endpoints of a periodic strip satisfy a single quadratic equation $T(z) = z$ 
where $T$ is the periodicity transformation. The two roots $\alpha$ and $\bar\alpha$ of this equation are 
conjugates and they are also negatives of each other by the mirror symmetry across 
the edge $(1/0, 0/1)$, so we have $\bar\alpha = -\alpha$. Writing $\alpha$ as $A + B\sqrt{n}$ with $A$ and $B$ rational, 
the equation $\bar\alpha = -\alpha$ implies that $A = 0$. Since $\alpha$ is positive we then have $\alpha = B\sqrt{n}$ 
with $B > 0$. Thus $\alpha$ is the square root of the positive rational number $B^2 n$. □ 

The next result answers another natural question one might ask: 


Proposition 4.3. Every periodic line in the dual tree of the Farey diagram occurs 
as the separator line for some form. 


Proof: Given a periodic line, the periodicity of this line and of the corresponding 
infinite strip is realized by some linear fractional transformation T. As we have seen, 
the endpoints of the strip are the fixed points of T, the solutions of T(z) = z. This 
can be rewritten as a quadratic equation az? + bz + c = 0 with integer coefficients. 
The coefficient $a$ must be nonzero, otherwise we would have an equation $bz + c = 0$ 
with only one root if $b \ne 0$, while if $b = 0$ the equation would have no roots if $c \ne 0$. 
If $c = 0$ as well as $a = 0$ and $b = 0$ the equation would degenerate to $0 = 0$, meaning 
that every $z$ satisfied $T(z) = z$ so $T$ would be the identity transformation rather 
than the periodicity transformation, a contradiction. Thus $a$ must be nonzero, and 
we may assume that $a > 0$ by multiplying the equation by $-1$ if necessary. 

We claim that the periodic line we started with is a separator line in the topograph 
of the form $ax^2 + bxy + cy^2$. This just means that the values of the form at 
vertices along one edge of the associated periodic strip are all positive and the values 
along the other edge are all negative. To see why this is so let us factor $az^2 + bz + c$ 
as $a(z - \alpha)(z - \bar\alpha)$ where $\alpha$ and $\bar\alpha$ are the roots of $az^2 + bz + c = 0$ at the ends of the 
strip. From this factorization and the fact that $a$ is positive we see that the product 
$a(z - \alpha)(z - \bar\alpha)$ is negative if $z$ is between $\alpha$ and $\bar\alpha$ and positive if $z$ is greater than 
both $\alpha$ and $\bar\alpha$ or less than both $\alpha$ and $\bar\alpha$. (We saw this previously in the proof of 
Proposition 4.1.) Taking $z$ to be a rational number $x/y$, the equation 

$$a(x/y)^2 + b(x/y) + c = (ax^2 + bxy + cy^2)/y^2$$

implies that the form $ax^2 + bxy + cy^2$ takes negative values for $x/y$ in the interval 
between $\alpha$ and $\bar\alpha$ and positive values for $x/y$ outside this interval, assuming $x/y \ne 1/0$ 
so we are not dividing by 0 in the equation above. 

In terms of the circular Farey diagram the roots $\alpha$ and $\bar\alpha$ divide the boundary 
circle into two arcs, with the form taking positive values at vertices in one arc and 
negative values at vertices in the other arc, with the possible exception of the vertex 
$1/0$. However, this vertex is not actually exceptional since it lies in the arc with 
positive values and the form takes the value $a > 0$ when $x/y = 1/0$. This proves what 
we wanted since vertices along one edge of the strip lie in one arc and vertices 
along the other edge lie in the other arc. □ 

To illustrate the procedure in the preceding proof let us find a quadratic form 
whose periodic separator line is the following: 


[Figure: a periodic line in the dual tree, with fractional labels on the vertices of the underlying Farey diagram including $1/0$, $0/1$, $25/36$, and $84/121$]

The fractional labels correspond to vertices of the underlying Farey diagram, and 
from these we see that the translation giving the periodicity sends $1/0$ to $25/36$ and $0/1$ 
to $84/121$. The matrix of this transformation is $\begin{pmatrix} 25 & 84 \\ 36 & 121 \end{pmatrix}$ so it is the transformation 
$T(z) = (25z + 84)/(36z + 121)$. The fixed points of $T$ are determined by setting this equal 
to $z$. The resulting equation simplifies to $36z^2 + 96z - 84 = 0$ or just $3z^2 + 8z - 7 = 0$. 
The roots $\alpha$ and $\bar\alpha$ of this equation $az^2 + bz + c = 0$ are the fixed points, but we do 
not actually have to compute them since we know the quadratic form we want is then 
$ax^2 + bxy + cy^2$ which in this example is just $3x^2 + 8xy - 7y^2$. As a check, we can 
compute the separator line of this form: 

[Figure: the separator line in the topograph of $3x^2 + 8xy - 7y^2$]


This provides a realization of the given periodic line as the separator line of a hyper- 
bolic form. Any constant multiple of this form would also have the same separator line 
since we would just be multiplying all the labels along the line by the same constant. 

We could have simplified the calculation in this example by observing that the 
periodic line we started with is taken to itself by a glide reflection that moves the line 
only half as far along itself as the translation $T$ that we used. This glide reflection is 
$T'(z) = (2z + 7)/(3z + 10)$ and it has the same fixed points as $T$ so we could use the equation 
$T'(z) = z$ instead of $T(z) = z$. Thus we have $(2z + 7)/(3z + 10) = z$ which simplifies more 
directly to $3z^2 + 8z - 7 = 0$, the same final equation as before. 
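
The passage from a periodicity transformation to a quadratic form is a short computation that can be sketched in code. For $T(z) = (az + b)/(cz + d)$ the equation $T(z) = z$ becomes $cz^2 + (d - a)z - b = 0$. The function below is an illustration under that formula; the name `form_from_transformation` is hypothetical, and dividing out the common factor is a convenience, since any constant multiple of the form has the same separator line.

```python
from math import gcd


def form_from_transformation(a, b, c, d):
    """Coefficients (A, B, C) of the form A x^2 + B xy + C y^2 obtained
    from the fixed-point equation of T(z) = (a z + b) / (c z + d).

    T(z) = z is equivalent to c z^2 + (d - a) z - b = 0. The coefficients
    are divided by their gcd and normalized so that A > 0.
    """
    A, B, C = c, d - a, -b
    if A == 0:
        raise ValueError("leading coefficient is zero; no quadratic equation")
    g = gcd(gcd(A, B), C)
    A, B, C = A // g, B // g, C // g
    if A < 0:
        A, B, C = -A, -B, -C
    return A, B, C
```

Both the translation with matrix entries `(25, 84, 36, 121)` and the glide reflection with entries `(2, 7, 3, 10)` give `(3, 8, -7)`, the form $3x^2 + 8xy - 7y^2$.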


Exercises 


1. Determine the periodic separator line in the topograph for each of the following 
quadratic forms. (You do not need to include the fractional labels $x/y$.) 

(a) $x^2 - 7y^2$ 
(b) $3x^2 - 4y^2$ 
(c) $x^2 + xy - y^2$ 


2. For the following quadratic forms, draw enough of the topograph, starting with 
the edge separating the $1/0$ and $0/1$ regions, to locate the periodic separator line, and 
include the separator line itself in your topograph. 

(a) $x^2 + 3xy + y^2$ 
(b) $6x^2 + 18xy + 13y^2$ 
(c) $37x^2 - 104xy + 73y^2$ 


3. Using your answers in the first problem above, write down the continued fraction 
expansions for $\sqrt{7}$, $2\sqrt{3}/3$, and $(-1 + \sqrt{5})/2$. 


4. Use a quadratic form to compute continued fractions for the following pairs of 
numbers: 

(a) $(3 + \sqrt{6})/2$ and $(3 - \sqrt{6})/2$ 
(b) $(11 + \sqrt{13})/6$ and $(11 - \sqrt{13})/6$ 
(c) $(14 + \sqrt{7})/9$ and $(14 - \sqrt{7})/9$ 


5. Compute the periodic separator line for the form $x^2 - 43y^2$ and use this to find 
the continued fraction for $\sqrt{43}$. 


6. Use the form $x^2 - 2n^2 y^2$ to compute the continued fraction for $n\sqrt{2}$ for $n = 
1, 2, 3, 4, 5$. 


7. Compute the continued fraction for $\sqrt{21}$ using the form $x^2 - 21y^2$. Can you explain 
the relationship between this continued fraction and the one for $\sqrt{7/3}$ computed in 
this section? 


8. (a) Find a quadratic form whose periodic separator line has the following pattern: 

[Figure: a periodic separator line in which pairs of upward edges alternate with triples of downward edges]

(b) Generalize part (a) by replacing each pair of upward edges with $m$ upward edges 
and each triple of downward edges with $n$ downward edges. 


4.3 Pell’s Equation 


We encountered the equation $x^2 - dy^2 = 1$ briefly in Chapter 0. It is traditionally 
called Pell’s equation, and the similar equation $x^2 - dy^2 = -1$ is sometimes called 
Pell’s equation as well, or else the negative Pell’s equation. If $d$ is a square then the 
equations are not very interesting since in this case $d$ can be incorporated into the $y^2$ 
term, so one is looking at the equations $x^2 - y^2 = 1$ and $x^2 - y^2 = -1$, which have 
only the trivial solutions $(x, y) = (\pm 1, 0)$ for the first equation and $(x, y) = (0, \pm 1)$ 
for the second equation since these are the only cases when the difference between 
two squares is $\pm 1$. We will therefore assume that $d$ is not a square in what follows. 
It will suffice to find the solutions with $x$ and $y$ positive since the signs of $x$ and $y$ 
do not affect the value of $x^2 - dy^2$. 

As an example let us look at the equation $x^2 - 19y^2 = 1$. We drew a portion of 
the periodic separator line for the form $x^2 - 19y^2$ earlier, and here it is again with 
some of the fractional labels $x/y$ shown as well: 

[Figure: the periodic separator line for $x^2 - 19y^2$ with fractional labels including $1/0$, $9/2$, $48/11$, $170/39$, and $741/170$]

Ignoring the label $741/170$ for the moment, the other fractional labels are the first few 
convergents for the continued fraction for $\sqrt{19}$ that we computed before, which is 
$4 + \overline{1/2 + 1/1 + 1/3 + 1/1 + 1/2 + 1/8}$. These fractional labels are the labels on the 
vertices of the zigzag path in the infinite strip of triangles in the Farey diagram, which 
we can imagine being superimposed on the separator line in the figure. The fractional 
label we are most interested in is the $170/39$ in the upper right because this is the label 
on a region where the value of the form $x^2 - 19y^2$ is 1. This means exactly that 
$(x, y) = (170, 39)$ is a solution of $x^2 - 19y^2 = 1$. In terms of continued fractions, 
the fraction $170/39$ is the value of the initial portion $4 + 1/2 + 1/1 + 1/3 + 1/1 + 1/2$ of 
the continued fraction for $\sqrt{19}$, with the final term of the period omitted. 

Since the topograph of $x^2 - 19y^2$ is periodic along the separator line, there are 
infinitely many different solutions of $x^2 - 19y^2 = 1$ along the separator line. Going 
toward the left just gives the negatives $-x/y$ of the fractions $x/y$ to the right, so since 
we are only interested in the positive solutions it will suffice to see what happens 
toward the right. One way to do this is to use the linear fractional transformation 
that gives the periodicity translation toward the right. This transformation sends the 
edge $(1/0, 0/1)$ of the Farey diagram to the edge $(170/39, 741/170)$. Here $741/170$ is the 
value of the continued fraction $4 + 1/2 + 1/1 + 1/3 + 1/1 + 1/2 + 1/4$ obtained from 
the continued fraction for $\sqrt{19}$ by replacing the final number 8 in the period by 
one-half of its value, 4. The figure above shows why this is the right thing to do. We then 
get an infinite sequence of larger and larger positive solutions of $x^2 - 19y^2 = 1$ by 
repeatedly applying the periodicity transformation with matrix $\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix}$ to go from 
one solution to the next. For example, 

$$\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix} \begin{pmatrix} 170 \\ 39 \end{pmatrix} = \begin{pmatrix} 57799 \\ 13260 \end{pmatrix}$$

so the next solution of $x^2 - 19y^2 = 1$ after $(170, 39)$ is $(57799, 13260)$, and we could 
compute more solutions if we wanted. Obviously they are getting large rather quickly. 
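
To see how quickly the solutions grow, here is a minimal sketch that generates them by repeated multiplication by the matrix with rows $(p, dq)$ and $(q, p)$. As a stand-in for reading the first solution off the separator line, it finds $(p, q)$ by a brute-force search over $y$, which is only practical for small $d$. The function name `pell_solutions` is a hypothetical name chosen here.

```python
from math import isqrt


def pell_solutions(d, count):
    """Return the first `count` positive solutions (x, y) of x^2 - d*y^2 = 1.

    d must be a positive integer that is not a square. The smallest
    solution (p, q) is found by brute force; later solutions come from
    (x, y) -> (p*x + d*q*y, q*x + p*y).
    """
    if isqrt(d) ** 2 == d:
        raise ValueError("d must not be a perfect square")
    q = 1
    while True:
        target = d * q * q + 1
        p = isqrt(target)
        if p * p == target:        # found the smallest positive solution
            break
        q += 1
    solutions = []
    x, y = p, q
    for _ in range(count):
        solutions.append((x, y))
        x, y = p * x + d * q * y, q * x + p * y
    return solutions
```

For $d = 19$, `pell_solutions(19, 2)` returns `[(170, 39), (57799, 13260)]`, the two solutions found above.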




The two 170’s in the matrix $\begin{pmatrix} 170 & 741 \\ 39 & 170 \end{pmatrix}$ can hardly be just a coincidence. Notice 
also that the entry 741 factors as $19 \cdot 39$ which hardly seems like it should be just a 
coincidence either. Let us check that these numbers had to occur. In general, for the 
form $x^2 - dy^2$ let us suppose that we have found the first solution $(x, y) = (p, q)$ 
after $(1, 0)$ for Pell’s equation $x^2 - dy^2 = 1$, so $p^2 - dq^2 = 1$. Then based on the 
previous example we suspect that the periodicity transformation is: 

$$T\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} p & dq \\ q & p \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} px + dqy \\ qx + py \end{pmatrix}$$

To check that this is correct, the main thing to verify is that $T$ preserves the values of 
the quadratic form. Substituting $(px + dqy, qx + py)$ for $(x, y)$ in $x^2 - dy^2$ gives: 

$$\begin{aligned} (px + dqy)^2 - d(qx + py)^2 &= p^2x^2 + 2pdqxy + d^2q^2y^2 - dq^2x^2 - 2pdqxy - dp^2y^2 \\ &= (p^2 - dq^2)x^2 - d(p^2 - dq^2)y^2 \\ &= x^2 - dy^2 \quad \text{since } p^2 - dq^2 = 1 \end{aligned}$$

So $T$ does preserve the values of the form. In particular $T$ takes regions in the 
topograph with positive values to other such regions, and similarly for regions with 
negative values, so the separator line is taken to itself. The determinant of $\begin{pmatrix} p & dq \\ q & p \end{pmatrix}$ is 
$p^2 - dq^2 = 1$ which is positive so $T$ preserves orientation and hence it has to be a 
translation along the separator line. Since we chose $(p, q)$ to be the first solution of 
$x^2 - dy^2 = 1$ after $(1, 0)$, it follows that $T$ is the periodicity transformation and all 
occurrences of the label 1 along the separator line are images of the one at $1/0$ under 
positive or negative powers of $T$. (We have not actually proved yet that periodic 
separator lines always exist for forms $x^2 - dy^2$, but this will be shown in Theorem 5.2.) 

There are no other solutions of $x^2 - 19y^2 = 1$ besides the ones along the 
separator line because, as we will see in Section 5.1, the values in a topograph with a 
separator line change in a monotonic fashion as one moves away from the separator 
line, steadily increasing toward $+\infty$ on the positive side of the separator line and 
steadily decreasing toward $-\infty$ on the negative side. Thus the value 1 can occur only 
along the separator line itself. The monotonicity property also implies that the value 
$-1$ never appears in the topograph of $x^2 - 19y^2$ since it does not appear along the 
separator line, so the negative Pell equation $x^2 - 19y^2 = -1$ has no integer solutions. 

For an example where $x^2 - dy^2 = -1$ does have solutions, let us look again at 
the earlier example of $x^2 - 13y^2$: 

[Figure: the periodic separator line for $x^2 - 13y^2$ with fractional labels]

The first positive solution $(x, y) = (p, q)$ of $x^2 - 13y^2 = -1$ corresponds to the 
value $-1$ in the middle of the figure. This is determined by the continued fraction 
$p/q = 3 + 1/1 + 1/1 + 1/1 + 1/1 = 18/5$, so we have $(p, q) = (18, 5)$. The matrix 
$\begin{pmatrix} p & dq \\ q & p \end{pmatrix}$ in this case is $\begin{pmatrix} 18 & 65 \\ 5 & 18 \end{pmatrix}$ with determinant $18^2 - 13 \cdot 5^2 = -1$ so this gives the 
glide reflection along the periodic separator line taking $1/0$ to $18/5$ and $0/1$ to $65/18$. 
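The smallest positive solution of $x^2 - 13y^2 = +1$ is obtained by applying this glide 
reflection to $(18, 5)$, which gives: 

$$\begin{pmatrix} 18 & 65 \\ 5 & 18 \end{pmatrix} \begin{pmatrix} 18 \\ 5 \end{pmatrix} = \begin{pmatrix} 324 + 325 \\ 90 + 90 \end{pmatrix} = \begin{pmatrix} 649 \\ 180 \end{pmatrix}$$

Repeated applications of the glide reflection will give solutions of $x^2 - 13y^2 = -1$ 
and $x^2 - 13y^2 = +1$ alternately. 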
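
The alternation between the values $-1$ and $+1$ can be observed directly by iterating the glide reflection. The sketch below is an illustration only; it takes the solution $(p, q)$ as given rather than finding it from the topograph, and the function name `glide_orbit` is hypothetical.

```python
def glide_orbit(d, p, q, steps):
    """Apply the matrix with rows (p, d*q) and (q, p) repeatedly, starting at (p, q).

    Returns a list of triples (x, y, x^2 - d*y^2), one per step.
    """
    orbit = []
    x, y = p, q
    for _ in range(steps):
        orbit.append((x, y, x * x - d * y * y))
        x, y = p * x + d * q * y, q * x + p * y
    return orbit
```

With `glide_orbit(13, 18, 5, 4)` the first two entries are `(18, 5, -1)` and `(649, 180, 1)`, and the values of the form continue to alternate in sign.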


Exercises 


1. For the quadratic form $x^2 - 14y^2$ do the following things: 

(a) Draw the separator line in the topograph and compute the continued fraction for 
$\sqrt{14}$. 

(b) Find the smallest positive integer solutions of $x^2 - 14y^2 = 1$ and $x^2 - 14y^2 = -1$, 
if these equations have integer solutions. 

(c) Find the linear fractional transformation that gives the periodicity translation along 
the separator line and use this to find a second positive solution of $x^2 - 14y^2 = 1$. 

(d) Determine the integers $n$ with $|n| < 12$ such that the equation $x^2 - 14y^2 = n$ has 
an integer solution. (Do not forget the possibility that there could be solutions $(x, y)$ 
that are not primitive.) 


2. For the quadratic form $x^2 - 29y^2$ do the following things: 

(a) Draw the separator line and compute the continued fraction for $\sqrt{29}$. 

(b) Find the smallest positive integer solution of $x^2 - 29y^2 = -1$. 

(c) Find a glide reflection symmetry of the separator line and use this to find the 
smallest positive integer solution of $x^2 - 29y^2 = 1$. 


3. Show that every positive integer that is not a square can be expressed as a quotient 
$(n^2 - 1)/k^2$ for a suitably chosen pair of integers $n$ and $k$, and in fact there are infinitely 
many different choices for such a pair. Why did we exclude squares? 


5 The Classification of Quadratic Forms 


We can divide quadratic forms $Q(x, y) = ax^2 + bxy + cy^2$ with integer 
coefficients $a, b, c$ into four broad classes according to the signs of the values $Q(x, y)$, 
where as usual we restrict $x$ and $y$ to be integers. We will always assume at least one 
of the coefficients is nonzero, so $Q$ is not identically zero, and we will always assume 
$(x, y)$ is not $(0, 0)$. There are four possibilities: 


(I) If $Q(x, y)$ takes on both positive and negative values but not 0 then we call 
$Q$ a hyperbolic form. 

(II) If $Q(x, y)$ takes on both positive and negative values and also the value 0 then 
we call $Q$ a 0-hyperbolic form. 

(III) If $Q(x, y)$ takes on only positive values or only negative values then we call $Q$ 
an elliptic form. 

(IV) If $Q$ takes on the value 0 and either positive or negative values, but not both, 
then $Q$ is called a parabolic form. 


The hyperbolic-elliptic-parabolic terminology is motivated in part by what the level 
curves $ax^2 + bxy + cy^2 = k$ are when we allow $x$ and $y$ to take on all real values 
so that one gets actual curves. The level curves are hyperbolas in cases (I) and (II), 
and ellipses in case (III). In case (IV), however, the level curves are not parabolas as 
one might guess, but straight lines. From the classical perspective of conic sections 
parabolas are the transitional case between hyperbolas and ellipses, but from another 
viewpoint one can pass from hyperbolas to ellipses through a transitional case of a 
pair of parallel lines as in the family of curves $x^2 - cy^2 = 1$ which are hyperbolas for 
$c > 0$, ellipses for $c < 0$, and a pair of parallel lines for $c = 0$. Parabolic forms are 
much simpler than the other types and we will not be spending much time on them. 
As we will show later in the chapter, there is an easy way to distinguish the four 
types of forms $ax^2 + bxy + cy^2$ in terms of their discriminants $\Delta = b^2 - 4ac$: 

(I) If $\Delta$ is positive but not a square then $Q$ is hyperbolic. 
(II) If $\Delta$ is positive and a square then $Q$ is 0-hyperbolic. 
(III) If $\Delta$ is negative then $Q$ is elliptic. 
(IV) If $\Delta$ is zero then $Q$ is parabolic. 

Discriminants play a central role in the theory of quadratic forms. A natural question 
to ask is whether every integer occurs as the discriminant of some form, and this is 
easy to answer. For a form $ax^2 + bxy + cy^2$ we have $\Delta = b^2 - 4ac$, and this is 
congruent to $b^2$ mod 4. A square such as $b^2$ is always congruent to 0 or 1 mod 4, 
so the discriminant of a form is always congruent to 0 or 1 mod 4. Conversely, for 
every integer $\Delta$ congruent to 0 or 1 mod 4 there exists a form whose discriminant 
is $\Delta$. The simplest ones are: 

$x^2 - ky^2$ with discriminant $\Delta = 4k$ 

$x^2 + xy - ky^2$ with discriminant $\Delta = 4k + 1$ 

Here $k$ can be positive, negative, or zero. The forms $x^2 - ky^2$ and $x^2 + xy - ky^2$ are 
called the principal quadratic forms of these discriminants. 
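
The discriminant criterion and the principal forms translate directly into a few lines of code. The following is a minimal sketch; the function names `discriminant`, `classify`, and `principal_form` are hypothetical names chosen for this illustration, and `classify` simply applies the four rules (I) to (IV) as stated above.

```python
from math import isqrt


def discriminant(a, b, c):
    """Discriminant b^2 - 4ac of the form a x^2 + b xy + c y^2."""
    return b * b - 4 * a * c


def classify(a, b, c):
    """Type of the form a x^2 + b xy + c y^2 according to its discriminant."""
    if (a, b, c) == (0, 0, 0):
        raise ValueError("the form must not be identically zero")
    disc = discriminant(a, b, c)
    if disc < 0:
        return "elliptic"
    if disc == 0:
        return "parabolic"
    root = isqrt(disc)
    return "0-hyperbolic" if root * root == disc else "hyperbolic"


def principal_form(disc):
    """Coefficients (a, b, c) of the principal form of a given discriminant."""
    if disc % 4 == 0:
        return (1, 0, -disc // 4)         # x^2 - k y^2 with disc = 4k
    if disc % 4 == 1:
        return (1, 1, -(disc - 1) // 4)   # x^2 + xy - k y^2 with disc = 4k + 1
    raise ValueError("a discriminant must be congruent to 0 or 1 mod 4")
```

For example, `classify(1, 0, -19)` is `"hyperbolic"`, `principal_form(76)` is `(1, 0, -19)`, and `principal_form(-3)` is `(1, 1, 1)`, the form $x^2 + xy + y^2$.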


5.1 The Four Types of Forms 


We will analyze each of the four types of forms in turn, but before doing this let 
us make a few preliminary general comments. 

In the arithmetic progression rule for labeling the four regions surrounding an 
edge of the topograph, we can label the edge by the common 
increment $h = (q + r) - p = s - (q + r)$ as in the figure at 
the right. The edge can be oriented by an arrow showing the 
direction in which the progression increases by $h$. Changing 
the sign of $h$ corresponds to changing the orientation of the edge. In the special case 
that $h$ happens to be 0 the orientation of the edge is irrelevant and can be omitted. 

The values of the increment $h$ along the boundary of a region in the topograph 
have the interesting property that they also form an arithmetic progression when all 
these edges are oriented in the same direction, and the amount by which $h$ increases 
as we move from one edge to the next is $2p$ where $p$ is the label on the region adjacent 
to all these edges: 

[Figure: the edge labels along the boundary of the region labeled $p$]


We will call this property the second arithmetic progression rule. To see why it is 
true, start with the edge labeled h in the figure, with the adjacent regions labeled p 
and q. The original arithmetic progression rule then gives the value p + q +h in the 
next region to the right. From this we can deduce that the label on the edge between 
the regions labeled p and p +q + h must be h + 2p since this is the increment from 
q to p+(p+q+h). Thus the edge label increases by 2p when we move from one
edge to the next edge to the right, so by repeated applications of this fact we see that 
we have an arithmetic progression of edge labels all along the border of the region 
labeled p. 
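
The second arithmetic progression rule can also be observed numerically. The sketch below uses only the original arithmetic progression rule to walk along the border of a region labeled p, and records the edge labels it meets. The function name `border_increments` and its argument order are hypothetical choices made for this illustration.

```python
def border_increments(p, q, h, steps):
    """Edge labels met while walking along the border of the region labeled p.

    Start at an edge labeled h whose adjacent regions are labeled p and q,
    and move forward `steps` times, always keeping the p region on one side.
    Only the original arithmetic progression rule is used.
    """
    labels = [h]
    for _ in range(steps):
        r = p + q + h           # label of the region in front of the edge
        h_next = (p + r) - q    # increment across the edge between p and r
        labels.append(h_next)
        q, h = r, h_next        # the region r now plays the role of q
    return labels
```

Starting from p = 3, q = 5, h = 1 the successive labels differ by 6 = 2p, as the rule predicts.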

Another thing worth noting at this point is something that we will refer to as the 
monotonicity property. This says that in the figure at 
the right, if the three labels p, q, and h adjacent to 
an edge are all positive, then so are the three labels 
for the next two edges in front of this edge, and the 
new labels are larger than the old labels. It follows 
that when one continues forward going out this part 
of the topograph, all the labels become monotonically 
larger the farther one goes. Similarly, when the original three labels are negative, all 
the labels become larger and larger negative numbers. 


Next we have a very useful way to compute the discriminant of a form directly 
from its topograph: 


Proposition 5.1. If an edge in the topograph of a form Q(x,y) is labeled h with
adjacent regions labeled p and q, then the discriminant of Q(x,y) is $h^2 - 4pq$.


Proof: For the given form $Q(x,y) = ax^2 + bxy + cy^2$, the $1/0$ and $0/1$ regions in
the topograph are labeled a and c, and the edge in the topograph
separating these two regions has h = b since the $1/1$ region is
labeled a + b + c. So the statement of the proposition is correct
for this edge. For other edges we proceed by induction, moving
farther and farther out the tree. For the induction step suppose
we have two adjacent edges labeled h and k as in the figure, and
suppose inductively that the discriminant equals $h^2 - 4pq$. We have r = p + q + h, and
from the second arithmetic progression rule we know that k = h + 2q. Then we have
$k^2 - 4qr = (h + 2q)^2 - 4q(p + q + h) = h^2 + 4hq + 4q^2 - 4pq - 4q^2 - 4hq = h^2 - 4pq$,
which means that the result holds for the edge labeled k as well. □
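
The induction in this proof is easy to replay on a computer. The following Python sketch walks outward from the edge between the $1/0$ and $0/1$ regions, applying the two arithmetic progression rules at each step, and checks that $h^2 - 4pq$ stays constant. The helper names are hypothetical, and the walk only covers the part of the topograph in front of the starting edge.

```python
def forward_edges(p, q, h):
    """Labels (region, region, edge) of the two edges in front of an edge
    labeled h whose adjacent regions are labeled p and q."""
    r = p + q + h                                   # arithmetic progression rule
    return (p, r, h + 2 * p), (q, r, h + 2 * q)     # second progression rule


def edge_discriminant(p, q, h):
    """The quantity h^2 - 4pq attached to an edge."""
    return h * h - 4 * p * q


def discriminant_is_constant(a, b, c, depth=8):
    """Check h^2 - 4pq = b^2 - 4ac on all edges up to `depth` steps forward."""
    delta = b * b - 4 * a * c
    frontier = [(a, c, b)]          # the edge between the 1/0 and 0/1 regions
    for _ in range(depth):
        if any(edge_discriminant(*edge) != delta for edge in frontier):
            return False
        frontier = [nxt for edge in frontier for nxt in forward_edges(*edge)]
    return True
```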


Elliptic forms have fairly simple qualitative behavior, so let us look at these forms 
first. Recall that we defined a form Q(x, y) to be elliptic if it takes on only positive 
or only negative values at all integer pairs $(x,y) \neq (0,0)$. The positive and negative
cases are equivalent since one can switch from one to the other just by putting a minus 
sign in front of Q. Thus it suffices to consider the case that Q takes on only positive 
values, and we will always assume we are in this case whenever we are dealing with 
elliptic forms. We will also generally assume when we look at topographs of elliptic 
forms that the orientations of the edges are chosen so as to give positive h-values, 
unless we state otherwise. 

For a positive elliptic form Q let p be the minimum positive value taken on by
Q, so Q(x,y) = p for some $(x,y) \neq (0,0)$. Here (x,y) must be a primitive pair, since
otherwise Q would take on a smaller positive value than p. Thus there is a region
in the topograph of Q with the label p. All the edges having one endpoint at this
region must be oriented away from the region, by the arithmetic
progression rule and the assumption that p is the minimum value
of Q. The monotonicity property then implies that all edges farther
away from the p region are also oriented away from the region, and
the values of Q increase steadily as one moves away from the p
region.

For the edges making up the border of the p region we know
that the h-labels on these edges form an arithmetic progression
with increment 2p, provided that we temporarily re-orient these edges so that they
all point in the same direction. If some edge bordering the p region has the label h = 0
then the topograph has the form shown in the first figure below, with the orientations
on edges that give positive h-labels. An example of such a form is $px^2 + qy^2$. We
call the 0-labeled edge a source edge since all other edges are oriented away from
this edge.


The other possibility is that no edge bordering the p region has label h = 0.
Then since the labels on these edges form an arithmetic progression, there must be 
some vertex where the terms in the progression change sign. Thus when we orient the 
edges to give positive h-labels, all three edges meeting at this vertex will be oriented 
away from the vertex, as in the second figure above. We call this a source vertex since 
all edges in the topograph are oriented away from this vertex. 

If the three regions surrounding a source vertex are labeled p, q, r
then the fact that the three edges leading from this vertex all point
away from the vertex is equivalent to the three inequalities p < q + r,
q < p + r, and r < p + q. These are called triangle inequalities since they are satisfied
by the lengths of the three sides of any triangle. In the case of a source edge one of
the inequalities becomes an equality, for example r = p + q in the earlier figure with
a source edge.

As we know, any three integers p, q, r can be realized as the three labels surrounding
a vertex in the topograph of some form. If these are positive integers satisfying
the triangle inequalities then this vertex is the source vertex of an elliptic form since
these inequalities imply that the three edges at this vertex are oriented away from
the vertex, so the monotonicity property guarantees that all values of the form are
positive. The situation for source edges is simpler since any two positive integers p
and q determine an elliptic form with a source edge having adjacent regions labeled
p and q as in the earlier figure.
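
As a small illustration, one can build a form from three labels around a vertex and test the triangle inequalities directly. In the sketch below we assume the vertex is the one surrounded by the regions $1/0$, $0/1$, and $1/1$, so the three labels are the values at (1,0), (0,1), and (1,1). The function names are hypothetical.

```python
def form_from_vertex(p, q, r):
    """Coefficients (a, b, c) of the form taking the values p, q, r
    at (x, y) = (1, 0), (0, 1), (1, 1)."""
    return (p, r - p - q, q)


def is_source_vertex(p, q, r):
    """True when p, q, r are positive and satisfy the strict triangle inequalities."""
    return min(p, q, r) > 0 and p < q + r and q < p + r and r < p + q


def discriminant(a, b, c):
    """Discriminant b^2 - 4ac of the form a*x^2 + b*x*y + c*y^2."""
    return b * b - 4 * a * c
```

For every triple of positive integers passing `is_source_vertex`, the resulting form has negative discriminant, which is consistent with the form being elliptic.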


Now let us move on to hyperbolic forms, whose topographs have quite a different 
appearance from the topographs of elliptic forms. Most notably, the topographs of 
hyperbolic forms always contain a periodic separator line of the sort that we saw in 
several of the examples in the previous chapter. Here is the general statement: 


Theorem 5.2. In the topograph of a hyperbolic form the edges for which the two 
adjacent regions are labeled by numbers of opposite sign form a line which is 
infinite in both directions, and the topograph is periodic along this line, with other 
edges of the topograph leading off the line on both sides. 


Proof: For a hyperbolic form Q all regions in the topograph have labels that are either 
positive or negative, never zero, and there must exist two regions of opposite sign. 
By moving along a path in the topograph joining two such regions we will somewhere 
encounter two adjacent regions of opposite sign. Thus there must exist edges whose 
two adjacent regions have opposite sign. Let us call these edges separating edges. 

At an end of a separating edge the value of Q in the next region must be either 
positive or negative since Q does not take the value 0: 


[Figure: the two cases at an end of a separating edge, with the next region labeled + or -.]


This implies that exactly one of the two edges at each end of the first separating edge 
is also a separating edge. Repeating this argument, we see that each separating edge 
is part of a line of separating edges that is infinite in both directions, and the edges 
that lead off from this line are not separating edges. 

The monotonicity property implies that as we move off this line of separating 
edges the values of Q are steadily increasing through positive integers on the posi- 
tive side and steadily decreasing through negative integers on the negative side. In 
particular this means that there are no other separating edges that are not on the 
initial separator line, so there is only one separator line. 

It remains to prove that the topograph is periodic along the separator line. We
can assume all the edges along the separator line are oriented in the same direction
by changing the signs of the h values if necessary. For an edge of the separator line
labeled h with adjacent regions labeled p and $-q$ with p > 0 and q > 0, we know
that $h^2 + 4pq$ is the discriminant $\Delta$, by Proposition 5.1. The equation $\Delta = h^2 + 4pq$
with p and q positive implies that $\Delta$ is positive and furthermore that each of |h|,
p, and q is less than $\Delta$. Thus there are only finitely many possible values for h, p,
and q along the separator line since $\Delta$ is a constant depending only on Q. It follows
that there are only finitely many possible combinations of values h, p, and q at each
edge on the separator line. Since the separator line is infinite, there must then be two
edges on the line that have the same values of h, p, and q. Since the topograph is
uniquely determined by the three labels h, p, q at a single edge, the translation of
the line along itself that takes one edge to another edge with the same three labels
must preserve all the labels on the line. This shows that the separator line is periodic.

There must be edges leading away from the separator line on both the positive 
and the negative side, otherwise there would be just a single region on one side of 
the line, and then the second arithmetic progression rule would say that the h labels 
along the line formed an infinite arithmetic progression with nonzero increment 2p 
where p is the label on the region in question. However, this would contradict the 
fact that these h labels are periodic. □


The qualitative behavior of the topograph of a hyperbolic form away from the 
separator line fits the pattern we have seen in examples. Since the separator line is 
periodic the whole topograph is periodic, consisting of repeating sequences of trees 
leading off from the separator line on each side, with monotonically increasing pos- 
itive values of the form on each tree on the positive side of the separator line and 
monotonically decreasing negative values on the negative side, as a consequence of 
the monotonicity property. 


The remaining types of forms to consider are parabolic forms and 0-hyperbolic 
forms. These turn out to be less interesting, and they play only a minor role in the 
theory of quadratic forms. 

Parabolic and 0-hyperbolic forms are the forms whose topograph contains at 
least one region labeled 0. By the second arithmetic progression rule, each edge 
adjacent to a 0 region has the same label h, and from this it follows that the labels 
on the regions adjacent to the 0 region form an arithmetic progression: 


When h = 0 the topograph is as shown in the following figure:

[Figure: topograph with a 0 region, the same tree repeated along it, and regions labeled q, 4q, 9q, 16q, 25q.]

Thus the form is parabolic, taking on only positive or only negative values away from
the 0 region. An example of a form with this topograph is $Q(x,y) = qx^2$. Notice
that the topograph is periodic along the 0 region since it consists of the same tree
pattern repeated infinitely often.

The remaining case is that the label h on the edges bordering a 0 region is
nonzero. The arithmetic progression of values of Q adjacent to the 0 region is
then not constant, so it includes both positive and negative numbers, and hence Q is
0-hyperbolic. If the arithmetic progression includes
the value 0, this gives a second 0 region adjacent to
the first one, and the topograph is as shown at the
right. An example of a form with this topograph is
$Q(x,y) = qxy$, with the two 0 regions at $x/y = 1/0$
and $x/y = 0/1$.

If the arithmetic progression of values of Q adjacent to the 0 region does not
include 0, there will be an edge separating the positive from the negative values in
the progression. We can extend this separating edge to a line of separating edges as
we did with hyperbolic forms. If this extension does not eventually terminate with a 
second 0 region, the reasoning we used in the hyperbolic case would yield two edges 
along this line having the same h and the same positive and negative labels on the 
two adjacent regions, forcing the line to be periodic in the direction of this extension. 
This in turn would force it to be periodic in both directions. But this is impossible 
since the line began with a 0 region at one end. Thus the topograph contains a finite 
separator line connecting two 0 regions. 

An example of such a form is $Q(x,y) = qxy - py^2 = (qx - py)y$ which has
the value 0 at $x/y = 1/0$ and at $x/y = p/q$. Here we must have |q| > 1 for the two 0
regions to be nonadjacent. The separator line must then follow the strip of triangles
in the Farey diagram corresponding to the continued fraction for $p/q$. For example,
for $p/q = 2/5$ the topograph of the form $5xy - 2y^2 = (5x - 2y)y$ is the following:


This completes our description of what the topographs of the four types of forms 
look like. We can also deduce the characterization of each type in terms of the dis- 
criminant: 


Proposition 5.3. The four types of forms are distinguished by their discriminants, 
which are negative for elliptic forms, positive nonsquares for hyperbolic forms, 
positive squares for 0-hyperbolic forms, and zero for parabolic forms.


Proof: Consider first an elliptic form Q, which we may assume takes on only positive
values since changing Q to $-Q$ does not change the discriminant. The topograph
of Q contains either a source vertex or a source edge. For a source edge with the
label h = 0 separating regions with positive labels p and q the discriminant is
$\Delta = h^2 - 4pq = -4pq$, which is negative. For a source vertex with adjacent regions having
positive labels p, q, r the edge between the p and q regions is labeled h = p + q - r
so the discriminant can be expressed in the following way:

$$\begin{aligned}
\Delta = h^2 - 4pq &= (p + q - r)^2 - 4pq \\
&= p^2 + q^2 + r^2 - 2pq - 2pr - 2qr \\
&= p(p - q - r) + q(q - p - r) + r(r - p - q)
\end{aligned}$$

In the last line the three quantities in parentheses are negative by the triangle
inequalities, so $\Delta$ is again negative.

For a parabolic form the topograph contains a region labeled 0 bordered by edges
labeled 0, so $\Delta = h^2 - 4pq = 0$. A 0-hyperbolic form has a region labeled 0 bordered
by edges all having the same nonzero label h so $\Delta = h^2$, a positive square.

For an edge in the separator line for a hyperbolic form the adjacent regions have
labels p and $-q$ with p and q positive so $\Delta = h^2 + 4pq$ is positive. To see that
$\Delta$ is not a square, suppose the form is $ax^2 + bxy + cy^2$. Here a must be nonzero,
otherwise the form would have the value 0 at (x,y) = (1,0), which is impossible for a
hyperbolic form. If the discriminant were a square then the equation $az^2 + bz + c = 0$
would have a rational root $z = x/y$ with $y \neq 0$ by the familiar quadratic formula
$z = (-b \pm \sqrt{b^2 - 4ac})/2a$. Thus we would have $a(x/y)^2 + b(x/y) + c = 0$ and hence
$ax^2 + bxy + cy^2 = 0$, so the form would have the value 0 at a pair (x,y) with $y \neq 0$,
which is again impossible for a hyperbolic form. □
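
Proposition 5.3 translates directly into a short test on the coefficients. The sketch below is one way to write it in Python; the function name `classify` is hypothetical, and we assume the form is not identically zero.

```python
import math


def classify(a, b, c):
    """Type of the form a*x^2 + b*x*y + c*y^2, read off from its discriminant.

    Assumes (a, b, c) != (0, 0, 0).
    """
    delta = b * b - 4 * a * c
    if delta < 0:
        return "elliptic"
    if delta == 0:
        return "parabolic"
    root = math.isqrt(delta)   # integer square root, rounded down
    return "0-hyperbolic" if root * root == delta else "hyperbolic"
```

For example, the form $5xy - 2y^2$ considered earlier has discriminant 25, so `classify(0, 5, -2)` reports it as 0-hyperbolic.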


The presence or absence of periodicity in a topograph has the following conse- 
quence: 


Proposition 5.4. If an equation Q(x,y) = n with $n \neq 0$ has one integer solution
(x,y) then it has infinitely many integer solutions when Q is hyperbolic or parabolic,
but only finitely many integer solutions when Q is elliptic or 0-hyperbolic.


Proof: Consider first the hyperbolic and parabolic cases. Suppose (x,y) is a solution
of Q(x,y) = n. If (x,y) is a primitive pair, then n appears in the topograph of
Q so by periodicity it appears infinitely often, giving infinitely many solutions of
Q(x,y) = n. If there is a nonprimitive solution (x,y) then it is d times a primitive
pair $(x', y')$ with $Q(x', y') = n/d^2$. The latter equation has infinitely many solutions
$(x', y')$ by what we just showed, hence Q(x,y) = n has infinitely many solutions
$(x,y) = (dx', dy')$.

For elliptic and 0-hyperbolic forms there is no periodicity, and the monotonicity
property implies that each number appears in the topograph at most a finite number
of times. Thus Q(x,y) = n can have only finitely many primitive solutions. If it had
infinitely many nonprimitive solutions, these would yield infinitely many primitive
solutions of equations Q(x,y) = m for certain divisors m of n. However, this is
impossible since each equation Q(x,y) = m for a fixed m can have only finitely
many primitive solutions and n has only finitely many divisors since we assume it is
nonzero. □


Exercises 


1. (a) Find two primitive elliptic forms $ax^2 + cy^2$ that have the same discriminant
but take on different sets of values. Draw enough of the topographs of the two forms 
to make it apparent that they do not have exactly the same sets of values. (Remember 
that the topograph only shows the values Q(x, y) for primitive pairs (x, y).) 

(b) Do the same thing with hyperbolic forms $ax^2 + cy^2$.


2. (a) Show the quadratic form $Q(x,y) = 92x^2 - 74xy + 15y^2$ is elliptic by computing
its discriminant. 

(b) Find the source vertex or edge in the topograph of this form. 

(c) Using the topograph of this form, find all the integer solutions of
$92x^2 - 74xy + 15y^2 = 60$, and explain why your list of solutions is a complete list. (There are exactly
four pairs of solutions $\pm(x,y)$, three of which will be visible in the topograph.)


3. Show that if a form takes the same value on two adjacent regions of its topograph, 
then these regions are both adjacent to the source vertex or edge when the form is 
elliptic, or both lie along the separator line when the form is hyperbolic. 


4. Show that the minimum value of |h| for all the edges in the border of a given 
region in the topograph of an elliptic or hyperbolic form occurs at an edge having an 
endpoint that achieves the minimum distance to the separator line or source vertex 
or edge of all vertices in the border of the given region. 


5. (a) Show that if a quadratic form $Q(x,y) = ax^2 + bxy + cy^2$ can be factored
as a product (Ax + By)(Cx + Dy) with A, B, C, D integers, then Q takes the value
0 at some pair of integers $(x,y) \neq (0,0)$, hence Q must be either 0-hyperbolic or
parabolic. Show also, by a direct calculation, that the discriminant of this form is a 
square. 

(b) Find a 0-hyperbolic form Q(x, y) such that Q(1,5) = 0 and Q(7, 2) = 0 and draw 
a portion of the topograph of Q that includes the two regions where Q(x,y) = 0.


5.2 Equivalence of Forms


In the pictures of topographs we have drawn we often omit the fractional labels
$x/y$ for the regions in the topograph since the more important information is often just
the values Q(x,y) of the form. This leads to the idea of considering two quadratic
forms to be equivalent if their topographs “look the same” when the labels $x/y$ are
disregarded. For a precise definition, one can say that quadratic forms $Q_1$ and $Q_2$
are equivalent if there is a vertex $v_1$ in the topograph of $Q_1$ and a vertex $v_2$ in
the topograph of $Q_2$ such that the values of $Q_1$ in the three regions surrounding
$v_1$ are equal to the values of $Q_2$ in the three regions surrounding $v_2$. For example
if the values at $v_1$ are 2, 2, 3 then the values at $v_2$ should also be 2, 2, 3, in any
order, but 2, 3, 3 is regarded as different from 2, 2, 3. Since the three values around
a vertex determine all the other values in a topograph, having the same values at one
vertex guarantees that the topographs look the same everywhere if the labels $x/y$ are
omitted.

An alternative definition of equivalence of forms would be to say that two forms 
are equivalent if there is a linear fractional transformation in LF(Z) that takes the 
topograph of one form to the topograph of the other form. This is really the same 
as the first definition since there is a vertex of the topograph in the center of each 
triangle of the Farey diagram and we know that elements of LF(Z) are determined by 
where they send a triangle, so if two topographs each have a vertex surrounded by 
the same triple of numbers, there is an element of LF(Z) taking one topograph to the 
other, and conversely. 

A topograph and its mirror image correspond to equivalent forms since the mirror 
image topograph has the same three labels around each vertex as at the corresponding 
vertex of the original topograph. For example, switching the variables x and y reflects 
the circular Farey diagram across its vertical axis and hence reflects the topograph of a 
form Q(x,y) to the topograph of the equivalent form Q(y,x). As another example,
the forms $ax^2 + bxy + cy^2$ and $ax^2 - bxy + cy^2$ are always equivalent since they
are related by changing (x,y) to $(-x,y)$, reflecting the Farey diagram across its
horizontal axis, with a corresponding reflection of the topograph. 

Equivalent forms have the same discriminant since the discriminant of a form 
is determined by the three numbers surrounding any vertex, as these three numbers 
determine the numbers p,q,h at each edge abutting the vertex and the discriminant 
is $h^2 - 4pq$ for any of these edges.
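
The invariance of the discriminant can also be checked by direct computation. The sketch below assumes, as our reading of the earlier chapters, that an element of LF(Z) acts on a form through an integer change of variables $(x,y) \mapsto (\alpha x + \beta y, \gamma x + \delta y)$ with $\alpha\delta - \beta\gamma = \pm 1$. The function name `substitute` and the flat 4-tuple for the matrix are hypothetical conventions chosen for this illustration.

```python
def substitute(form, matrix):
    """Coefficients of Q(alpha*x + beta*y, gamma*x + delta*y).

    `form` is (a, b, c) for a*x^2 + b*x*y + c*y^2 and `matrix` is
    (alpha, beta, gamma, delta).
    """
    a, b, c = form
    alpha, beta, gamma, delta = matrix
    new_a = a * alpha**2 + b * alpha * gamma + c * gamma**2
    new_b = (2 * a * alpha * beta
             + b * (alpha * delta + beta * gamma)
             + 2 * c * gamma * delta)
    new_c = a * beta**2 + b * beta * delta + c * delta**2
    return (new_a, new_b, new_c)


def discriminant(form):
    """Discriminant b^2 - 4ac of the form with coefficients (a, b, c)."""
    a, b, c = form
    return b * b - 4 * a * c
```

With the matrix (-1, 0, 0, 1) this reproduces the change from $ax^2 + bxy + cy^2$ to $ax^2 - bxy + cy^2$, and with (0, 1, 1, 0) it interchanges x and y.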

Our next goal will be to see how to compute all the different equivalence classes 
of forms of a given discriminant. The method for doing this will depend on which of 
the four types of forms we are dealing with.

Let us look at elliptic forms first to see how to determine all the different
equivalence classes for a given discriminant in this case. As usual it suffices to consider
only the forms with positive values. At a source vertex or edge in
the topograph of a positive elliptic form Q let the smaller two of
the three adjacent values of Q be a and c with $a \le c$, and let the
edge between them be labeled $h \ge 0$. The third of the three smallest
values of Q is then a + c - h. The form Q is equivalent to the
form $ax^2 + hxy + cy^2$ which has the values a, c, and a + h + c
for (x,y) = (1,0), (0,1), and (1,1). Since a and c are the smallest
values of Q we have $a \le c \le a + c - h$, and the latter inequality is
equivalent to $h \le a$. Summarizing, we have the inequalities $0 \le h \le a \le c$.

Thus every positive elliptic form is equivalent to a form $ax^2 + hxy + cy^2$ with
$0 \le h \le a \le c$. An elliptic form satisfying these conditions is called reduced. Two
different reduced elliptic forms with the same discriminant are never equivalent since
a and c are the labels on the two regions in the topograph where the form takes its
smallest values, and h is determined by a, c, and $\Delta$ via the formula $\Delta = h^2 - 4ac$
since we assume $h \ge 0$.

To avoid dealing with negative numbers let us set $\Delta = -D$ with D > 0, so the
discriminant equation becomes $D = 4ac - h^2$. To find all equivalence classes of forms
of discriminant $-D$ we therefore need to find all solutions of the equation

$4ac = h^2 + D$ with $0 \le h \le a \le c$

This equation implies that h must have the same parity as D, and we can bound the
choices for h by the inequalities $4h^2 \le 4a^2 \le 4ac = D + h^2$ which imply $3h^2 \le D$,
or $h^2 \le D/3$. This limits h to a finite number of possibilities, and for each of these
values of h we just need to find all of the finitely many factorizations of $h^2 + D$ as
4ac with $a \le c$ and $h \le a$. In particular this shows that there are just finitely many
equivalence classes of elliptic forms of a given discriminant.

As an example consider the case $\Delta = -260$, so D = 260. Since $\Delta$ is even, so is h,
and we must have $h^2 \le 260/3$, so h must be 0, 2, 4, 6, or 8. The corresponding values
of a and c that are possible can then be computed from the equation $4ac = 260 + h^2$,
always keeping in mind the requirement that $h \le a \le c$. The possibilities are shown
in the following table:


h | ac | (a,c)

0 | 65 | (1,65), (5,13)
2 | 66 | (2,33), (3,22), (6,11)
4 | 69 | none
6 | 74 | none
8 | 81 | (9,9)


As a side comment, note that the values of ac increase successively by 1, 3, 5, 7, ....
This always happens when $\Delta$ is even, so the h values are 0, 2, 4, 6, .... For odd $\Delta$
the values of h are 1, 3, 5, 7, ... and the increments for ac are 2, 4, 6, 8, .... (Let it
be an exercise for the reader to figure out why these statements are true.)

From the table we see that every positive elliptic form of discriminant $-260$ is
equivalent to one of the six reduced forms $x^2 + 65y^2$, $5x^2 + 13y^2$, $2x^2 + 2xy + 33y^2$,
$3x^2 + 2xy + 22y^2$, $6x^2 + 2xy + 11y^2$, or $9x^2 + 8xy + 9y^2$, and no two of these
reduced forms are equivalent to each other. Here are small parts of the topographs
of these forms:


[Figure: small parts of the topographs of the six reduced forms of discriminant -260.]


In the first two topographs the central edge is a source edge, and in the last four 
topographs the lower vertex is a source vertex. 

One might wonder what would happen if we continued the table with larger values
of h not satisfying $h^2 \le 260/3$. For example for h = 10 we would have ac = 90 so the
condition $a \le c$ would force a to be 9 or less, violating the condition $h \le a$. Larger
values of h would run into similar difficulties. The condition $h^2 \le D/3$ saves one the
trouble of trying larger values of h.
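
The search carried out by hand above is short enough to automate. The following Python sketch lists all triples (a, h, c) with $4ac = h^2 + D$ and $0 \le h \le a \le c$, using the bound $3h^2 \le D$ to stop. The function name `reduced_elliptic_forms` is hypothetical, and the sketch makes no attempt at efficiency.

```python
def reduced_elliptic_forms(D):
    """All (a, h, c) with 4ac = h^2 + D and 0 <= h <= a <= c.

    Each triple stands for the reduced form a*x^2 + h*x*y + c*y^2
    of discriminant -D.
    """
    forms = []
    h = D % 2                          # h has the same parity as D
    while 3 * h * h <= D:              # the bound h^2 <= D/3
        n = h * h + D
        if n % 4 == 0:
            ac = n // 4
            a = max(h, 1)              # enforce h <= a, with a positive
            while a * a <= ac:         # enforce a <= c
                if ac % a == 0:
                    forms.append((a, h, ac // a))
                a += 1
        h += 2
    return forms
```

For D = 260 the function returns the six triples (1,0,65), (5,0,13), (2,2,33), (3,2,22), (6,2,11), (9,8,9), in agreement with the table.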


Next we consider hyperbolic forms of a given discriminant $\Delta > 0$. The topograph
of a hyperbolic form has a separator line, so for each edge in the separator line we
have the edge label h with the adjacent regions labeled p and $-q$ for p > 0 and
q > 0. We can assume $h \ge 0$ by reorienting the edge if necessary. The discriminant
equation is $\Delta = h^2 + 4pq$. Since p and q are positive this implies $h^2 < \Delta$ so there are
only finitely many possibilities for h along the separator lines of forms of the given
discriminant $\Delta$. For each h we then look at the factorizations $\Delta - h^2 = 4pq$. There
can be only finitely many of these, so this means there are just finitely many possible
combinations of labels h, p, $-q$ and hence only finitely many possible separator lines.
Thus the number of equivalence classes of hyperbolic forms of a given discriminant
is finite.

As an example, let us determine all the quadratic forms of discriminant 60, up
to equivalence. Two obvious forms of discriminant 60 are $x^2 - 15y^2$ and $3x^2 - 5y^2$,
whose separator lines consist of periodic repetitions of the following two patterns:

From the topographs it is apparent that these two forms are not equivalent, and also
that the negatives of these two forms, $-x^2 + 15y^2$ and $-3x^2 + 5y^2$, give two more
inequivalent forms, for a total of four equivalence classes so far. To see whether
there are others we use the formula $\Delta = 60 = h^2 + 4pq$ relating the values p and
$-q$ along an edge labeled h in the separator line, with p > 0 and q > 0. The various
possibilities are listed in the table below. The equation $\Delta = h^2 + 4pq$ implies that h
and $\Delta$ must have the same parity, just as in the elliptic case.


h | pq | (p,q)

0 | 15 | (1,15), (3,5), (5,3), (15,1)
2 | 14 | (1,14), (2,7), (7,2), (14,1)
4 | 11 | (1,11), (11,1)
6 | 6 | (1,6), (2,3), (3,2), (6,1)

Each pair of values for (p,q) in the table occurs at some edge along the separator
line in one of the two topographs shown above or the negatives of these topographs.
Hence every form of discriminant 60 is equivalent to one of these four. If it had
not been true that all the possibilities in the table occurred in the topographs of the
forms we started with, we could have used these other possibilities for h, p, and q to
generate new topographs and hence new forms, eventually exhausting all the finitely
many possibilities.

The procedure in this example works for all hyperbolic forms. One makes a list of
all the integer solutions of $\Delta = h^2 + 4pq$ with $h \ge 0$, p > 0, and q > 0, then one
constructs separator lines that realize all the resulting pairs (p,q). The different
separator lines correspond exactly to the different equivalence classes of forms of
discriminant $\Delta$. Each solution (h,p,q) gives a form $px^2 + hxy - qy^2$. These are
organized into cycles corresponding to the pairs $(p,-q)$ occurring along one of the
periodic separator lines. Thus in the preceding example with $\Delta = 60$ the 14 pairs (p,q)
in the table give rise to the four cycles along the four different separator lines.
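
This procedure can be sketched in Python as follows. The first function lists the triples (h, p, q); the second moves one edge forward along a separator line using the two arithmetic progression rules; the third groups the triples into cycles. All three names are hypothetical, and the sketch assumes the discriminant is positive and not a square, so that every separator line is periodic.

```python
def separator_edge_labels(delta):
    """All (h, p, q) with delta = h^2 + 4pq, h >= 0, p > 0, q > 0."""
    triples = []
    h = delta % 2                       # h has the same parity as delta
    while h * h < delta:
        rest = delta - h * h
        if rest % 4 == 0:
            pq = rest // 4
            triples += [(h, p, pq // p) for p in range(1, pq + 1) if pq % p == 0]
        h += 2
    return triples


def next_separator_edge(h, p, q):
    """Move one edge forward along the separator line.

    The current edge has regions p and -q on its two sides and increment h
    in the direction of travel, so h may be negative here.
    """
    r = p - q + h                       # next region, by the progression rule
    if r == 0:
        raise ValueError("a 0 region appears, so the form is not hyperbolic")
    if r > 0:
        return (h - 2 * q, r, q)        # new edge borders r and -q
    return (h + 2 * p, p, -r)           # new edge borders p and r = -(-r)


def cycles(delta):
    """Group the triples (|h|, p, q) by the separator line they lie on."""
    remaining = set(separator_edge_labels(delta))
    result = []
    while remaining:
        start = min(remaining)
        edge, cycle = start, set()
        while True:
            cycle.add((abs(edge[0]), edge[1], edge[2]))
            edge = next_separator_edge(*edge)
            if edge == start:           # one full period has been traversed
                break
        result.append(sorted(cycle))
        remaining -= cycle
    return result
```

For $\Delta = 60$ the first function returns the 14 triples of the table and `cycles(60)` returns four cycles, with 4, 3, 3, and 4 triples respectively.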

A hyperbolic form $ax^2 + bxy + cy^2$ belongs to one of the cycles for the
discriminant $\Delta = b^2 - 4ac$ exactly when a > 0 and c < 0 since a and c are the numbers p
and $-q$ lying on opposite sides of an edge of the separator line when (x,y) = (1,0)
and (0,1).

If we superimpose the separator line of a hyperbolic form on the associated in- 
finite strip in the Farey diagram, we see that the forms within a cycle correspond to 
the edges of the Farey diagram that lie in the strip and join one border of the strip to 
the other. For example, for the form $3x^2 - 5y^2$ we obtain the following picture, with
fans of two triangles alternating with fans of three triangles:

[Figure: the separator line of $3x^2 - 5y^2$ superimposed on the associated strip in the Farey diagram.]

The number of forms within a cycle can be fairly large in general. The situation can
be improved somewhat by considering only the “most important” forms in the cycle,
namely the forms that correspond to those edges in the strip that separate pairs of
adjacent fans, indicated by heavier lines in the figure above. In terms of the topograph
itself these are the edges in the separator line whose two endpoints have edges leading
away from the separator line on opposite sides. The forms corresponding to these
edges are traditionally called the reduced forms within the given equivalence class. In
the example of discriminant 60 these are the forms with (p,q) = (1,6), (6,1), (3,2),
and (2,3). These are the forms $x^2 + 6xy - 6y^2$, $6x^2 + 6xy - y^2$, $3x^2 + 6xy - 2y^2$, and
$2x^2 + 6xy - 3y^2$. In this example there is just one reduced form for each cycle, but
in more complicated examples there can be any number of reduced forms in a cycle.
Note that the reduced forms do not necessarily give the simplest-looking forms, which
in this example were the original forms $x^2 - 15y^2$ and $3x^2 - 5y^2$ along with their
negatives $-x^2 + 15y^2$ and $-3x^2 + 5y^2$, or alternatively $15x^2 - y^2$ and $5x^2 - 3y^2$.


For 0-hyperbolic forms it is rather easy to determine all the equivalence classes
of forms of a fixed discriminant. As we saw in our initial discussion of 0-hyperbolic
forms, their topographs contain two regions labeled 0, and the labels on the regions
adjacent to each 0-region form an arithmetic progression with increment given by the
label on the edges bordering the 0-region. Previously we called this label h but now
let us change notation and call it q. We may assume q is positive by re-orienting the
edges if necessary. The discriminant is $\Delta = q^2$ so both 0-regions must have the same
edge label q. Either one of the two arithmetic progressions determines the form up to
equivalence since two successive terms in the progression together with the 0 in the
adjacent region give the three values of the form around a vertex in the topograph.

The form $qxy - py^2$ has discriminant $q^2$ and has $-p$ as one term of the arithmetic
progression adjacent to the 0-region $x/y = 1/0$, namely in the region $x/y = 0/1$.
Thus every 0-hyperbolic form of discriminant $q^2$ is equivalent to one of these forms
$qxy - py^2$. Arithmetic progressions with increment q can be thought of as congruence
classes mod q, so only the mod q value of p affects the arithmetic progression
and hence we may assume $0 \le p < q$. The number of equivalence classes of 0-hyperbolic
forms of discriminant $q^2$ is therefore at most q, the number of congruence
classes mod q. However, the number of equivalence classes could be smaller since
each form has two 0-regions and hence two arithmetic progressions, which could be
the same or different. Since either arithmetic progression determines the form, if the
two progressions are the same then the topograph must have a mirror symmetry
interchanging the two 0-regions. This always happens for example if the two 0-regions
touch, which is the case p = 0 so the form is qxy and the mirror symmetry just
interchanges x and y. If we let r denote the number of forms $qxy - py^2$ without
mirror symmetry then the number of equivalence classes of 0-hyperbolic forms of
discriminant $q^2$ is q - r since each form without mirror symmetry has two different
arithmetic progressions giving the same form.


For parabolic forms it is even easier to describe what all the different equivalence 
classes are since we have seen exactly what their topographs look like: There is a 
single region labeled 0 and all the regions adjacent to this have the same label q, 
which can be any nonzero integer, positive or negative. The integer q thus determines 
the equivalence class, so there is one equivalence class of parabolic forms for each 
nonzero integer q, with qx² being one form in this equivalence class. Parabolic forms 
all have discriminant 0, so in this case there are infinitely many different equivalence 
classes with the same discriminant. 


We have now shown how to compute all the equivalence classes of forms of a 
given discriminant for each of the four types of forms. In particular we have proved 
the following general fact: 


Theorem 5.5. There are only a finite number of equivalence classes of forms with 
a given nonzero discriminant. 


Exercises 


1. (a) For positive elliptic forms of discriminant Δ = −D, verify that the smallest 
value of D for which there are at least two inequivalent forms of discriminant −D is 
D = 12. 

(b) If we add the requirement that all forms under consideration are primitive, then 
what is the smallest D? 


2. Determine all the equivalence classes of positive elliptic forms of discriminants 
−67, −104, and −347. 


3. Find two elliptic forms that are not equivalent but take on the same three smallest 
values a<b<c. 


4. Determine the number of equivalence classes of quadratic forms of discriminant 
Δ = 120 and list one form from each equivalence class. 


5. Do the same thing for Δ = 61. 


6. (a) Find the smallest positive nonsquare discriminant for which there is more than 
one equivalence class of forms of that discriminant. (In particular, show that all 
smaller discriminants have only one equivalence class.) 

(b) Find the smallest positive nonsquare discriminant for which there are two inequiv- 
alent forms of that discriminant, neither of which is simply the negative of the other. 


7. (a) Determine all the equivalence classes of 0-hyperbolic forms of discriminant 49. 
(b) Determine which equivalence class in part (a) each of the forms 7xy − py² for 
p = 0, 1, 2, 3, 4, 5, 6 belongs to. 


5.3 The Class Number 


When considering equivalence classes of forms of a given discriminant there are 
further refinements that turn out to be very useful. The first involves forms whose 
topographs are mirror images of each other. According to the definition we have 
given, two such forms are regarded as equivalent. However, there is a more refined 
notion of equivalence in which two forms are considered equivalent only if there is an 
orientation-preserving transformation in LF(Z) taking the topograph of one form to 
the topograph of the other. In this case the forms are called properly equivalent. 

To illustrate the distinction between equivalence and proper equivalence, let us 
look at the earlier example of discriminant Δ = −260 where we saw that there were 
six equivalence classes of forms: 


[Figure: topographs of the six forms of discriminant −260, in the order 
x² + 65y², 5x² + 13y², 2x² + 2xy + 33y², 3x² + 2xy + 22y², 6x² + 2xy + 11y², 
9x² + 8xy + 9y².] 


In the first two topographs the central edge is a source edge and in the other four 
the lower vertex is a source vertex. Whenever there is a source edge the topograph 
has mirror symmetry across a line perpendicular to the source edge. When there is 
a source vertex there is mirror symmetry only when at least two of the three sur- 
rounding values of the form are equal, as in the third and sixth topographs above, 
but not the fourth or fifth topographs. Thus the mirror images of the fourth and 
fifth topographs correspond to two more quadratic forms which are not equivalent to 
them under any orientation-preserving transformation. With the more refined notion 
of proper equivalence there are therefore eight proper equivalence classes of forms 
of discriminant −260. 

To obtain explicit formulas for the mirror image forms we can interchange the 
coefficients a and c in ax² + bxy + cy², which corresponds to interchanging x and 
y, reflecting the topograph across a vertical line. Alternatively we could change the 
sign of b, which corresponds to changing the sign of either x or y and thus reflecting 
the topograph across a horizontal line. 

For a general discriminant Δ each equivalence class of forms of discriminant Δ 
gives rise to two proper equivalence classes except when the class contains forms 
with mirror symmetry, in which case equivalence and proper equivalence amount to 
the same thing since every orientation-reversing equivalence can be converted into 
an orientation-preserving equivalence by composing with a mirror reflection. Here we 
are using the fact that the only linear fractional transformations that take a topograph 
to itself and reverse orientation are mirror reflections, as will be shown in Section 5.4 
when we study symmetries of topographs in more detail. 


Multiplying a form by an integer d > 1 does not change its essential features in 
any significant way, so it is reasonable when classifying forms to restrict attention just 
to primitive forms, the forms that are not proper multiples of other forms. In other 
words, one considers only the forms ax² + bxy + cy² for which a, b, and c have 
no common divisor greater than 1. The primitivity of a form is detectable just from 
the numbers appearing in its topograph since all the numbers in the topograph of a 
nonprimitive form are divisible by some number d > 1, and conversely if all numbers 
in the topograph of a form ax² + bxy + cy² are divisible by d then in particular a, c, 
and a + b +c, the values at (1,0), (0,1), and (1,1), are divisible by d which implies 
that b is also divisible by d so the whole form is divisible by d. Thus primitivity 
is a property of equivalence classes of forms. Multiplying a form by d multiplies its 
discriminant by d², so nonprimitive forms of discriminant Δ exist exactly when Δ is 
a square times another discriminant. For example, when Δ = −12 = 4(−3) one has 
the primitive form x² + 3y² as well as the nonprimitive form 2x² + 2xy + 2y² which 
is twice the form x² + xy + y² of discriminant −3. 
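
The primitivity test is a single gcd computation. The following is a minimal sketch, with hypothetical helper names `discriminant` and `is_primitive` that are not part of the text, applied to the two forms of discriminant −12 just mentioned:

```python
from math import gcd

def discriminant(a, b, c):
    """Discriminant b^2 - 4ac of the form ax^2 + bxy + cy^2."""
    return b * b - 4 * a * c

def is_primitive(a, b, c):
    """True when a, b, c have no common divisor greater than 1."""
    return gcd(gcd(a, b), c) == 1

# The two forms of discriminant -12 from the example.
for form in [(1, 0, 3), (2, 2, 2)]:
    print(form, discriminant(*form), is_primitive(*form))
```

Since primitivity is a property of equivalence classes, the answer does not depend on which form in a class is tested.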


The number of proper equivalence classes of primitive forms of a given discrim- 
inant is called the class number for that discriminant, where in the case of elliptic 
forms one considers only the forms with positive values. The traditional notation for 
the class number for discriminant Δ is h_Δ. (This h has nothing to do with the h 
labels on edges in topographs.) 

Since we have an algorithm for computing the finite set of equivalence classes of 
forms of a given discriminant, this leads to an algorithm for computing class numbers. 
When computing the table of triples (h,a,c) for elliptic forms or (h, p,q) for hyper- 
bolic forms we omit the nonprimitive triples since these correspond to nonprimitive 
forms. Then we determine which of the remaining forms have mirror symmetry. For 
elliptic forms these are the cases when one or more of the inequalities 0 ≤ h ≤ a ≤ c 
is an equality, as we will see in the next section. For hyperbolic forms mirror symme- 
tries can be detected in the separator line. Forms with mirror symmetry count once 
when computing the class number, and forms without mirror symmetry count twice. 
However, just having an algorithm to compute the class number h_Δ does not make it 
transparent how h_Δ depends on Δ, and indeed this is a very difficult question which 
is still only partially understood. 
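
For negative discriminants the procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, triples are stored as (a, h, c) for the form ax² + hxy + cy², and the bound 3h² ≤ |Δ| is the one that follows from h ≤ a ≤ c and 4ac = |Δ| + h².

```python
from math import gcd, isqrt

def reduced_elliptic_forms(delta):
    """Reduced primitive positive forms (a, h, c) with 0 <= h <= a <= c
    and h^2 - 4ac = delta, for a negative discriminant delta."""
    if delta >= 0 or delta % 4 not in (0, 1):
        raise ValueError("expected a negative discriminant")
    D = -delta
    forms = []
    h = D % 2                          # h has the same parity as delta
    while 3 * h * h <= D:              # h <= a <= c forces 3h^2 <= D
        n = (D + h * h) // 4           # n = ac
        for a in range(max(h, 1), isqrt(n) + 1):
            if n % a == 0:
                c = n // a
                if gcd(gcd(a, h), c) == 1:   # omit nonprimitive triples
                    forms.append((a, h, c))
        h += 2
    return forms

def has_mirror_symmetry(a, h, c):
    """Some inequality in 0 <= h <= a <= c is an equality."""
    return h == 0 or h == a or a == c

def class_number(delta):
    """Symmetric forms count once, the others twice."""
    return sum(1 if has_mirror_symmetry(*f) else 2
               for f in reduced_elliptic_forms(delta))

print(reduced_elliptic_forms(-260))
print(class_number(-260))
```

For Δ = −260 this reproduces the six reduced forms of the example, four with mirror symmetry and two without, giving eight proper equivalence classes.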

Of special interest are the discriminants for which all forms are primitive. These 
are called fundamental discriminants. Thus a fundamental discriminant is one which 
is not a square times a smaller discriminant. For example, 8 is a fundamental 
discriminant even though it is divisible by a square, 4, since the other factor 2 is not 
the discriminant of any form, as it is not congruent to 0 or 1 mod 4. Technically 
1 is a fundamental discriminant according to our definition, but we will exclude this 
trivial case. Thus fundamental discriminants are never squares, so fundamental 
discriminants appear only for elliptic and hyperbolic forms. With 1 excluded it is easy 
to check that the fundamental discriminants Δ with |Δ| < 40 are 5, 8, 12, 13, 17, 
21, 24, 28, 29, 33, 37 and −3, −4, −7, −8, −11, −15, −19, −20, −23, −24, 
−31, −35, −39. (Note that 20 = 4·5 is not fundamental since 5 is itself a discriminant.) 

It is not hard to characterize precisely the discriminants A that are fundamental. 
First write Δ = 2^k n with k ≥ 0 and n odd, possibly negative. If any odd square 
divides n then we can factor this out of Δ and still get a discriminant since odd 
squares are congruent to 1 mod 4 so multiplying by an odd square does not affect 
whether a number is 0 or 1 mod 4. The exponent k in 2^k can never be 1 since this 
would imply Δ ≡ 2 mod 4. If k ≥ 4 we can factor powers of 4 out of Δ until we have 
k equal to 2 or 3 and still have a discriminant. If k = 3 we cannot factor a 4 out of 
Δ since this would give the excluded case k = 1. If k = 2 we can factor 4 = 2² out of 
Δ exactly when n ≡ 1 mod 4. Finally, when k = 0 we have Δ = n so we must have 
n ≡ 1 mod 4. Thus fundamental discriminants other than −4 and ±8 are of three 
types: 

• Δ = n with |n| a product of distinct odd primes and n ≡ 1 mod 4. 
• Δ = 4n with |n| a product of distinct odd primes and n ≡ 3 mod 4. 
• Δ = 8n with |n| a product of distinct odd primes. 
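
As a sanity check, the definition of a fundamental discriminant can be compared by brute force with the three types in the list above. The sketch below uses hypothetical function names and treats −4 and ±8 as the special cases named in the text; it illustrates the characterization and is not an efficient test.

```python
def is_discriminant(n):
    """Discriminants are the integers congruent to 0 or 1 mod 4."""
    return n % 4 in (0, 1)

def is_fundamental(delta):
    """Definition: a discriminant other than 0 and 1 that is not
    d^2 times another discriminant for any d > 1."""
    if delta in (0, 1) or not is_discriminant(delta):
        return False
    d = 2
    while d * d <= abs(delta):
        if delta % (d * d) == 0 and is_discriminant(delta // (d * d)):
            return False
        d += 1
    return True

def odd_squarefree_gt1(m):
    """m > 1 is a product of distinct odd primes."""
    if m < 2 or m % 2 == 0:
        return False
    p = 3
    while p * p <= m:
        if m % (p * p) == 0:
            return False
        p += 2
    return True

def is_fundamental_by_type(delta):
    """The three types from the list, plus the special cases -4 and +-8."""
    if delta in (-4, 8, -8):
        return True
    if delta % 2 == 1:
        return delta % 4 == 1 and odd_squarefree_gt1(abs(delta))
    if delta % 8 == 0:
        return odd_squarefree_gt1(abs(delta // 8))
    if delta % 4 == 0:
        n = delta // 4
        return n % 4 == 3 and odd_squarefree_gt1(abs(n))
    return False

print([d for d in range(-39, 40) if is_fundamental(d)])
```

The two tests agree on every integer in the range checked in the accompanying assertions, which is evidence for the characterization but of course not a proof.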


Every nonsquare discriminant can be factored uniquely as Δ = d²Δ′ where Δ′ is a 
fundamental discriminant and d ≥ 1. The number d is called the conductor of Δ. 
Fundamental discriminants are those whose conductor is 1. Conductors will become 
important when we study the deeper properties of forms in later chapters. The class 
number h_Δ is always a multiple of h_Δ′ and there is a not-too-complicated formula 
for what this multiple is, so the determination of class numbers reduces largely to the 
case of fundamental discriminants. However, we will not be going into more detail on 
the relationship between h_Δ and h_Δ′ since this would lead us somewhat outside the 
scope of the book. 


The question of which discriminants have class number 1 has been much studied. 
This amounts to finding the discriminants for which all primitive forms are equivalent 
since if all primitive forms are equivalent, they are all equivalent to the principal form 
which has mirror symmetry so they are all properly equivalent to the principal form. 

For elliptic forms the following nine fundamental discriminants have class number 1: 


Δ = −3, −4, −7, −8, −11, −19, −43, −67, −163 


In addition there are four more which are not fundamental: −12, −16, −27, −28. It 
was conjectured by Gauss around 1800 that there are no other negative discriminants 
of class number 1. Over a century later in the 1930s it was shown that there is 
at most one more, and then in the 1950s and 1960s Gauss’s conjecture was finally 
proved completely. 

Another result from the 1930s is that for each number n there are only finitely 
many negative discriminants with class number n. Finding what these discriminants 
are is a difficult problem, however, and so far this has been done only in the range 
n ≤ 100. 

The situation for positive discriminants with class number 1 is not as well un- 
derstood. Computations show that there are a large number of positive fundamental 
discriminants with class number 1, and it seems likely that there are in fact infinitely 
many. However, this has not been proved and remains one of the most basic unsolved 
problems about quadratic forms. If one allows nonfundamental discriminants then 
it is known that there are infinitely many with h_Δ = 1, including for example the 
discriminants Δ = 2^(2k+1) for k ≥ 1 and Δ = 5^(2k+1) for k ≥ 0. 


Returning to the nine negative fundamental discriminants of class number 1, it is 
easy to check in each case that all forms are equivalent. For example when Δ = −163 
and we apply the earlier algorithm to find all reduced forms we must have h odd with 
h² ≤ 163/3 so the only possibilities are h = 1, 3, 5, 7. From the equation 4ac = 163 + h² 
the corresponding values of ac are 41, 43, 47, 53 which all happen to be prime, and 
since a ≤ c this forces a to be 1 in each case. But since h ≤ a this means h must 
be 1, and we obtain the single quadratic form x² + xy + 41y². 

The corresponding polynomial x² + x + 41 has a curious property discovered by 
Euler: For each x = 0, 1, 2, 3, ..., 39 the value of x² + x + 41 is a prime number. Here 
are these forty primes: 


41 43 47 53 61 71 83 97 113 131 151 173 197 223 251 281 313 
347 383 421 461 503 547 593 641 691 743 797 853 911 971 
1033 1097 1163 1231 1301 1373 1447 1523 1601 


Notice that the successive differences between these primes are 2, 4, 6, 8, 10, ..., 78 
since [(x + 1)² + (x + 1) + 41] − [x² + x + 41] = 2(x + 1). The next number in 
the sequence after 1601 would be 1681 = 41², not a prime. (Write x² + x + 41 as 
x(x + 1) + 41 to see why x = 40 must give a nonprime.) A similar thing happens 
for the other negative fundamental discriminants of class number 1. The nontrivial 
cases are listed in the table below, where D = −Δ. 


 D | polynomial   | prime values 
 7 | x² + x + 2   | 2 
11 | x² + x + 3   | 3 5 
19 | x² + x + 5   | 5 7 11 17 
43 | x² + x + 11  | 11 13 17 23 31 41 53 67 83 101 
67 | x² + x + 17  | 17 19 23 29 37 47 59 73 89 107 127 149 173 199 227 257 

Satisfactory explanations are known for the occurrence of so many prime values of 
these quadratic polynomials but they involve fairly deep theory. It is curious that the 
lists of prime values account for all primes less than 100 except 79. 

Suppose one asks about the next forty values of x² + x + 41 after the value 41² 
when x = 40. The next value, when x = 41, is 1763 = 41·43, also not a prime. After 
this the next two values are primes, then comes 2021 = 43·47, then four primes, 
then 2491 = 47·53, then six primes, then 3233 = 53·61, then eight primes, then 
4331 = 61·71, then ten primes, then 5893 = 71·83. This last number was for x = 76, 
and the next four values are prime as well for x = 77, 78, 79, 80, completing the 
second 40 values. But then the pattern breaks down when x = 81 where one gets 
the value 6683 = 41·163. Thus, before the breakdown, not only were we getting 
sequences of 2, 4, 6, 8, 10 primes but the nonprime values were the products of two 
successive terms in the original sequence of prime values 41, 43, 47, 53, 61, ... 
All this seems quite surprising, even if the nice patterns do not continue forever. A 
partial explanation can be found in the fact that the polynomial P(x) = x² + x + 41 
satisfies the identity P(40 + n²) = P(n − 1)P(n) as one can easily check, so when 
n = 1, 2, 3, ... we get P(41) = P(0)P(1) = 41·43, P(44) = 43·47, P(49) = 47·53, 
P(56) = 53·61, etc. However this does not explain why the intervening values of P(x) 
should be prime. The polynomials in the preceding table exhibit similar behavior. 
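
The statements about P(x) = x² + x + 41 are easy to confirm numerically. The following sketch uses a naive trial-division primality test (an assumption made for simplicity; any primality test would do) and checks the identity P(40 + n²) = P(n − 1)P(n), which follows from expanding (n² − n + 41)(n² + n + 41) = (n² + 41)² − n².

```python
def is_prime(n):
    """Naive trial division, adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def P(x):
    return x * x + x + 41

# The first forty values are all prime.
first_forty = [P(x) for x in range(40)]
print(all(is_prime(v) for v in first_forty))

# Values of x from 40 through 81 where P(x) fails to be prime.
composite_x = [x for x in range(40, 82) if not is_prime(P(x))]
print(composite_x)

# The identity P(40 + n^2) = P(n - 1) P(n).
print(all(P(40 + n * n) == P(n - 1) * P(n) for n in range(1, 50)))
```

The composite positions 41, 44, 49, 56, 65, 76 are exactly the numbers 40 + n² for n = 1, ..., 6, while x = 40 and x = 81 are the two composite values in this range that the identity does not account for.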


Exercises 


1. Compute the class number for the following discriminants: 
(a) −23 (b) −47 (c) −71 (d) −87 (e) −92 (f) 145 (g) 148. 


2. In this extended exercise the goal will be to show that the only negative even dis- 
criminants with class number 1 are —4, —8, —12, —16, and —28. (Note that of these, 
only —4 and —8 are fundamental discriminants.) The strategy will be to exhibit an 
explicit reduced primitive form Q different from the principal form x² + dy² for 
each discriminant −4d with d > 4 except d = 7. This will be done by breaking the 
problem into several cases, where in each case a form Q will be given and you are 
to show that this form has the desired properties, namely it is of discriminant —4d, 
primitive, reduced, and different from the principal form. You should also check that 
the cases considered cover all possibilities. 

(a) Suppose d is not a prime power. Then it can be factored as d = ac where 1 < a < c 
and a and c are coprime. In this case let Q be the form ax² + cy². 

(b) The form ax² + 2xy + cy² will work provided that d + 1 factors as d + 1 = ac 
where a and c are coprime and 1 < a < c. If d is odd, for example a power of an odd 
prime, then d + 1 is even so it has such a factorization d + 1 = ac unless d + 1 = 2^n. 

(c) If d = 2^n the cases we need to consider are n ≥ 3 since d > 4. When n = 3 take 
Q to be 3x² + 2xy + 3y² and when n ≥ 4 take Q to be 4x² + 4xy + (2^(n−2) + 1)y². 

(d) When d + 1 = 2^n the cases of interest are n ≥ 3. When n = 3 we have d = 7 
which is one of the allowed exceptions with class number 1. When n = 4 we have 
d = 15 and 3x² + 5y² works as in part (a). When n = 5 we have d = 31 and we take 
the form 5x² + 4xy + 7y². When n ≥ 6 we use the form 8x² + 6xy + (2^(n−3) + 1)y². 


3. Show that the class number for discriminant Δ = q² > 1 is φ(q) where φ(q) is 
the number of positive integers less than q and coprime to q. 


5.4 Symmetries of Forms 


We have observed that some topographs are symmetric in various ways. To give 
a precise meaning to this term, let us say that a symmetry of a form Q (or its to- 
pograph) is a transformation T in LF(Z) that leaves all the values of Q unchanged, 
so Q(T(x,y)) = Q(x,y) for all pairs (x,y). For example, every hyperbolic form 
has a periodic separator line, which means there is a symmetry that translates the 
separator line along itself. If T is the symmetry translating by one period in either 
direction, then all the positive and negative powers of T are also translational sym- 
metries. Strictly speaking, the identity transformation is always a symmetry but we 
will sometimes ignore this trivial symmetry. 

Some hyperbolic forms also have mirror symmetry, where the symmetry is re- 
flection across a line perpendicular to the separator line. This reflector line could 
contain one of the edges leading off the separator line, or it could be halfway between 
two consecutive edges leading off the separator line on the same side. Both kinds of 
symmetry occur along the separator line of the form x² − 19y², for example: 


Elliptic forms can have mirror symmetries as well, as we saw in the earlier example 
Δ = −260 where two topographs had mirror symmetry across a line perpendicular to 
an edge and two had symmetry across a line containing an edge. 

There is a simple characterization of when each of the two types of mirror sym- 
metry occurs in terms of coefficients: 


Proposition 5.6. (a) The forms whose topograph has a mirror symmetry reflecting 
across a line perpendicular to an edge and passing through its midpoint are exactly 
the forms equivalent to a form ax² + cy². 




(b) The forms whose topograph has a mirror symmetry reflecting across a line 
containing an edge of the topograph are exactly the forms equivalent to a form 
ax² + bxy + ay². Alternatively, one could take forms ax² + axy + cy², or forms 
ax² + cxy + cy². 


In particular the principal forms x² − ky² and x² + xy − ky² have mirror 
metry, so there is at least one form with mirror symmetry in each discriminant. 


Proof: We will use the following figures: 


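[Figures: (1) a central edge with regions labeled a and c on its two sides and regions 
labeled a + b + c and a − b + c above and below; (2) the case b = 0, with a dotted 
reflector line perpendicular to the central edge; (3) the case a = c, with the regions 
above and below labeled d and e and the reflector line containing the central edge.] 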


The first figure shows the labels surrounding an edge in a topograph, the central edge 
in the figure. There is a mirror symmetry across a line perpendicular to this edge 
exactly when b = 0 since the labels a +b +c and a- b +c above and below the edge 
must be equal. This symmetry is shown in the second figure as reflection across the 
dotted line. The other type of mirror symmetry is reflection across the line containing 
the central edge, as in the third figure. This occurs exactly when a = c, and in this 
case d = 2a + b and e = 2a − b. A form whose topograph has one of these two types 
of mirror symmetry is thus equivalent to a form ax² + cy² or ax² + bxy + ay², 
respectively, where the region to the left of the central edge is the 1/0 region and the 
region to the right is the 0/1 region. 

For the second type of mirror symmetry we could also choose one of the upper two 
edges to be the edge between the 1/0 and 0/1 regions and then the form would become 
ax² + dxy + dy² or dx² + dxy + ay². Conversely, every form ax² + dxy + dy² or 
dx² + dxy + ay² has the mirror symmetry shown in the figure since for example the 
upper left edge labeled d together with the adjacent regions labeled a and d force 
the region on the right to also be labeled a. □ 


Corollary 5.7. The numbers appearing on reflector lines of mirror symmetries of 
topographs are always divisors of the discriminant. 


Proof: A form ax² + cy² as in the second figure above has discriminant Δ = −4ac 
so the labels a and c on the two regions bisected by the reflector line are divisors 
of Δ. In the third figure the numbers on the reflector line are d and e and these are 
divisors of Δ = d² − 4ad = e² − 4ae. □ 


The converse question of which divisors of the discriminant occur in topographs 
and whether they occur only along reflector lines will be explored in Section 6.2. For 
fundamental discriminants we will see that the numbers appearing on reflector lines 
are exactly the divisors of the discriminant, but for nonfundamental discriminants 
this need not be true. 


Let us consider now what sorts of symmetries are possible in general for the vari- 
ous types of forms, beginning with elliptic forms. For an elliptic form each symmetry 
must take the source vertex or edge to itself since this is where the smallest values 
of the form occur. In the case of a source edge, if a symmetry does not interchange 
the two ends of the source edge then the symmetry must be either the identity or a 
reflection across a line containing the source edge. If a symmetry does interchange 
the two ends of a source edge then it must either be a reflection across a line perpen- 
dicular to the edge or a 180 degree rotation of the topograph about the midpoint of 
the edge. Referring to the figure below, this rotation can only 
give a symmetry if a = c and a + b + c = a − b + c which is equivalent 
to having b = 0. Thus the form is ax² + ay² so if it is primitive it 
is just x² + y². Note that multiplying any form by a constant does 
not affect its symmetries so there is no harm in considering only 
primitive forms. For the form x² + y² note also that this form has 
both types of mirror symmetries, and the composition of these two 
mirror symmetries is the 180 degree rotational symmetry. 

[Figure: a central edge with regions labeled a and c on its two sides and regions 
labeled a + b + c and a − b + c above and below.] 

For a source vertex, a symmetry must take this vertex to itself. If a symmetry is 
orientation-preserving and not the identity then it must be a rotation about the source 
vertex by either one-third or two-thirds of a full turn. In either case this means that 
the three labels around the source vertex must be equal, so if the source vertex is 
the lower vertex in the figure above then the condition is a = c = a — b + c, which is 
equivalent to saying a = b = c. The form is then ax² + axy + ay² so if it is primitive 
it is x² + xy + y². The only other sort of symmetry for a source vertex is reflection 
across a line containing one of the three edges that meet at the source vertex. The 
only time there can be more than one such symmetry is when all three adjacent labels 
are equal so we are again in the situation of a form ax² + axy + ay². 

For an elliptic form ax² + bxy + cy² that is reduced, so 0 ≤ b ≤ a ≤ c, it is 
easy to recognize exactly when symmetries occur, namely when at least one of these 
three inequalities becomes an equality. Again using the figure above, when b = 0 one 
has a source edge with a mirror symmetry across the perpendicular line. When b = a 
we have a − b + c = c so there is a mirror symmetry across the lower right edge. 
And when a = c one has mirror symmetry across the central edge. Since a and c 
are the two smallest labels on regions in the topograph, we see that reduced forms 
ax² + bxy + ay² occur when the smaller two of the three labels at the source vertex 
are equal, and reduced forms ax² + axy + cy² occur when the larger two labels are 
equal, at 0/1 and −1/1. 

Certain combinations of equalities in 0 ≤ b ≤ a ≤ c are also possible. If b = 0 and 
a = c the form is a(x² + y²) with a source edge and both types of mirror symmetry 
as well as 180 degree rotational symmetry. Another possibility is that b = a = c so 
the form is a(x² + xy + y²) with the symmetries described earlier. These are the 
only combinations of equalities that can occur since we must have a > 0 so 0 = b = a 
is impossible. 

For reduced elliptic forms this exhausts all the possible symmetries since if we 
have strict inequalities 0 < b < a < c then the values of the form in the four regions 
shown in the figure above are all distinct. The first time this occurs is when the 
inequalities are 0 < 1 < 2 < 3 so the form is 2x² + xy + 3y² of discriminant −23. 
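
The criterion for reduced elliptic forms translates directly into code. The sketch below, with hypothetical function names, reads off the mirror symmetries of a reduced form from the equalities among 0 ≤ b ≤ a ≤ c and then searches by increasing D = 4ac − b² for the first reduced form with no symmetry.

```python
def mirror_symmetries(a, b, c):
    """Mirror symmetries of a reduced form ax^2 + bxy + cy^2,
    one for each equality among 0 <= b <= a <= c."""
    if not (0 <= b <= a <= c and a > 0):
        raise ValueError("form is not reduced")
    found = []
    if b == 0:
        found.append("perpendicular to the source edge")
    if b == a:
        found.append("across the lower right edge")
    if a == c:
        found.append("across the central edge")
    return found

def reduced_forms(D):
    """All reduced forms (a, b, c) with 4ac - b^2 = D > 0."""
    if D % 4 not in (0, 3):            # -D must be 0 or 1 mod 4
        return []
    forms = []
    b = D % 2
    while 3 * b * b <= D:
        n = (D + b * b) // 4           # n = ac
        a = max(b, 1)
        while a * a <= n:
            if n % a == 0:
                forms.append((a, b, n // a))
            a += 1
        b += 2
    return forms

def first_without_symmetry():
    """Smallest D having a reduced form with 0 < b < a < c."""
    D = 1
    while True:
        for form in reduced_forms(D):
            if not mirror_symmetries(*form):
                return D, form
        D += 1

print(first_without_symmetry())
```

The search stops at D = 23 with the triple (2, 1, 3), matching the form 2x² + xy + 3y² found above.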


Now consider hyperbolic forms. These all have periodic separator lines so they 
always have translational symmetries, and the question is what other sorts of sym- 
metries are possible. For a hyperbolic form each symmetry must take the separator 
line to itself since this line consists of the edges that separate positive from negative 
values of the form. It is a simple geometric fact that a symmetry of a line L that is 
divided into a sequence of edges, say of length 1, extending to infinity in both direc- 
tions, must be either a translation along L by some integer distance in either direction, 
or a reflection of L fixing either a vertex of L or the midpoint of an edge of L and 
interchanging the two halves of L on either side of the fixed point. This can be seen 
as follows. Symmetries of L are assumed to take vertices to vertices, so suppose the 
symmetry T sends a vertex v to the vertex T(v). Then if T preserves the orientation 
of L it must be a translation along L by the distance from v to T(v) as one can see 
by considering what T does to the two edges adjacent to v, then to the next two 
adjacent edges on either side, then the next two edges, and so on. If T reverses the 
orientation of L then either T(v) = v or T fixes the midpoint of the segment from v 
to T(v) since it sends this segment to a segment of the same length with one end at 
T(v) but extending back toward v since T reverses orientation of L. Thus T fixes a 
point of L in either case, and it follows that T must reflect L across this fixed point, 
as one can again see by considering the edge or edges containing the fixed point, then 
the next two edges, and so on. If the distance from v to T(v) is an even integer, the 
midpoint between v and T(v) will be a vertex, and if it is odd, the midpoint will be 
a midpoint of an edge. 

Returning to the situation of a symmetry T of the topograph of a hyperbolic form 
that takes the separator line L to itself, T must also take the side of L with positive 
labels to itself, so T preserves orientation of the plane exactly when it preserves ori- 
entation of L. Thus the only orientation-preserving symmetries of the topograph are 
translations along the separator line, and the only orientation-reversing symmetries 
are the two kinds of reflections across lines perpendicular to L. 

If the separator line of a hyperbolic form has a mirror symmetry then because of 
periodicity there has to be at least one reflector line in each period, but in fact there are 
exactly two reflector lines in each period. To see this, let T be the translation by one 
period and let R₁ be a reflection across a reflector line L₁. Consider the composition 
TR₁, reflecting first by R₁ then translating by T, so TR₁ is an orientation-reversing 
symmetry. If L₂ is the line halfway between L₁ and T(L₁) then T(R₁(L₂)) = L₂ as 
we can see in the first figure below: 


[Figures: the reflector lines L₁ and L₂ along the separator line, together with their 
images R₁(L₂), T(L₁), and R₂(L₁).] 


Thus TR₁ is an orientation-reversing symmetry that takes L₂ to itself while preserving 
the positive and negative sides of the separator line, so TR₁ must be a reflection R₂ 
across L₂. This shows that there are at least two reflector lines in each period. There 
cannot be more than two since if R₁ and R₂ are the reflections across two adjacent 
reflector lines L₁ and L₂ as in the second figure then the composition R₂R₁, first 
reflecting by R₁ then by R₂, is orientation-preserving and sends L₁ to R₂(R₁(L₁)) = 
R₂(L₁) so this composition is a symmetry translating the separator line by twice the 
distance between L₁ and L₂. The distance between L₁ and L₂ must then be half the 
length of the period, otherwise if the translation R₂R₁ were some power T^n of the 
basic periodicity translation T with |n| > 1, there would be fewer than two reflector 
lines in a period. 

For completeness let us also describe the symmetries for the remaining two types 
of forms besides elliptic and hyperbolic forms. For a 0-hyperbolic form, if the two 
regions labeled 0 in the topograph have a border edge in common then a symmetry 
must take this edge to itself, and it cannot interchange the ends of the edge since pos- 
itive values must go to positive values. The only possibility is then a reflection across 
this edge, which is always a symmetry of the topograph. If the two 0-regions do not 
have a common border edge they are joined by a finite separator line and a symmetry 
must take this line to itself, without interchanging the positive and negative sides. 
The only possibility is then a reflection across a line perpendicular to the separator 
line and passing through its midpoint. This reflection gives a symmetry only when 
the finite continued fraction associated to the form is palindromic. 

A parabolic form has a single 0-region in its topograph, so the bordering line for 
this region must be taken to itself by any symmetry. Every symmetry of this bordering 
line gives a symmetry of the form, either a translation along the line or a reflection 
across a perpendicular line. 


The preceding analysis shows in particular the following fact: 


Proposition 5.8. All orientation-reversing symmetries of the topograph of a form 
are mirror symmetries, reflecting across a line that is either perpendicular to or 
contains an edge of the topograph. 


Traditionally, a form whose topograph has an orientation-reversing symmetry is 
called ambiguous although there is really nothing about the form that is ambiguous 
in the usual sense of the word, unless perhaps it is the fact that such a form is indis- 
tinguishable from its mirror image. 


Let us define the symmetric class number h_Δ^s to be the number of equivalence 
classes of primitive forms of discriminant Δ with mirror symmetry. Recall that 
equivalence is the same as proper equivalence for forms with mirror symmetry. The 
ordinary class number h_Δ is thus h_Δ^s plus twice the number of equivalence classes of 
primitive forms without mirror symmetry. We have h_Δ ≥ h_Δ^s, and in fact h_Δ is always 
an integer multiple of h_Δ^s as we will see in Proposition 7.16. 

In contrast with h_Δ it is possible to compute h_Δ^s explicitly. Here is the result for 
elliptic and hyperbolic forms: 


Theorem 5.9. If Δ is a nonsquare discriminant and k is the number of distinct 
prime divisors of Δ then h_Δ^s = 2^(k−1) except in the following cases: 

(a) If Δ = 4(4m + 1) then h_Δ^s = 2^(k−2). 

(b) If Δ = 32m then h_Δ^s = 2^k. 


For example, for the discriminants $\Delta = 60 = 3\cdot 4\cdot 5$ and $\Delta = -260 = -4\cdot 5\cdot 13$
that we looked at in the previous section the number of distinct prime divisors is
$k = 3$ so the theorem says there are $2^2 = 4$ equivalence classes of mirror symmetric
forms in these two cases since the exceptional situations in (a) and (b) do not occur
here and all forms of these two discriminants are primitive. This agrees with what the
topographs showed.
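
For negative discriminants the statement of the theorem can be checked by brute force. The following is a minimal sketch, not a method taken from the text: it assumes the description of reduced elliptic forms as the primitive $[a,b,c]$ with $0 \le b \le a \le c$, and the fact that such a form has mirror symmetry exactly when $b = 0$, $b = a$, or $a = c$. The function names are hypothetical and chosen only for illustration.

```python
from math import gcd, isqrt

# Illustrative helper names; none of them come from the text.

def prime_divisors(n):
    """Distinct prime divisors of a positive integer n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def reduced_forms(disc):
    """Primitive reduced forms (a, b, c), 0 <= b <= a <= c, of discriminant disc < 0."""
    forms = []
    b = disc % 2                       # b has the same parity as the discriminant
    while 3 * b * b <= -disc:          # b <= a <= c forces 3b^2 <= |disc|
        ac = (b * b - disc) // 4
        for a in range(max(b, 1), isqrt(ac) + 1):
            if ac % a == 0 and gcd(gcd(a, b), ac // a) == 1:
                forms.append((a, b, ac // a))
        b += 2
    return forms

def symmetric_class_number(disc):
    """Count the reduced forms with b = 0, b = a, or a = c."""
    return sum(1 for a, b, c in reduced_forms(disc)
               if b == 0 or b == a or a == c)

def theorem_5_9(disc):
    """Value predicted by Theorem 5.9 for a nonsquare discriminant."""
    k = len(prime_divisors(abs(disc)))
    if disc % 32 == 0:                 # case (b): disc = 32m
        return 2 ** k
    if disc % 16 == 4:                 # case (a): disc = 4(4m + 1)
        return 2 ** (k - 2)
    return 2 ** (k - 1)

for disc in (-260, -420):
    print(disc, symmetric_class_number(disc), theorem_5_9(disc))
```

The sketch handles only $\Delta < 0$, where reduced forms are easy to enumerate. For $\Delta > 0$ one would have to work with cycles of reduced forms along the separator line instead.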

The proof of the theorem will involve considering a number of different cases. 
Fortunately most of the resulting complication disappears in the final answer. 


Proof: By Proposition 5.6 every form with mirror symmetry is equivalent to a form
$ax^2 + cy^2$ or $ax^2 + axy + cy^2$. The strategy will be to count how many of these
special forms there are that are primitive with discriminant $\Delta$, then determine which
of these special forms are equivalent.

For counting the special forms $ax^2 + cy^2$ and $ax^2 + axy + cy^2$ we may assume
$a > 0$ since $a$ is the value of the form when $(x,y) = (1,0)$ and for elliptic forms
we only consider those with positive values, while for hyperbolic forms we are free to
change a form to its negative so it suffices to count only those with $a > 0$ and then
double the result.


Case 1: Forms $ax^2 + cy^2$. Then $\Delta = -4ac = 4\delta$ for $\delta = -ac$. Primitivity of the
form is equivalent to $a$ and $c$ being coprime. The only way to have coprime factors $a$
and $c$ of $\delta = -ac$ is to take an arbitrary subset of the distinct primes dividing $\delta$
and let $a$ be the product of these primes each raised to the same power as in $\delta$ (so
$a = 1$ when we choose the empty subset). The number of such subsets is $2^{k'}$ where $k'$
is the number of distinct prime divisors of $\delta$, so there are $2^{k'}$ primitive forms
$ax^2 + cy^2$ with $a > 0$.


Case 2: Forms $ax^2 + axy + cy^2$ with $\Delta$ odd. We have $\Delta = a^2 - 4ac$ so $\Delta$ and $a$
have the same parity. From $\Delta = a(a - 4c)$ we see that $a$ divides $\Delta$. We claim that
each divisor $a$ of $\Delta$ gives rise to a form $ax^2 + axy + cy^2$ of discriminant $\Delta$.
Solving $\Delta = a^2 - 4ac$ for $c$ gives $c = (a^2 - \Delta)/4a$. The numerator is divisible by 4
since $a$ and $\Delta$ are odd and hence $a^2$ and $\Delta$ are both 1 mod 4, making the numerator
0 mod 4. The numerator is also divisible by $a$ if $a$ divides $\Delta$. Since 4 and $a$ are
coprime when $a$ is odd, it follows that $4a$ divides the numerator so $c$ is an integer
and we get a form $ax^2 + axy + cy^2$ of discriminant $\Delta$. This form is primitive exactly
when $a$ and $c$ are coprime. This is equivalent to saying that the two factors of
$\Delta = a(a - 4c)$ are coprime since any divisor of $a$ and $c$ must divide the two factors,
and conversely any divisor of the two factors must divide $a$ and $4c$, hence also $c$
since this divisor of the odd number $a$ must be odd. As in Case 1, the only way to
obtain a factorization $\Delta = a(a - 4c)$ with the two factors coprime is to take an
arbitrary subset of the distinct primes dividing $\Delta$ and let $a$ be the product of these
primes each raised to the same power as in $\Delta$. The number of such subsets is $2^k$ so
this is the number of primitive forms $ax^2 + axy + cy^2$ with $a > 0$ when $\Delta$ is odd.


There remain the forms $ax^2 + axy + cy^2$ with $\Delta = 4\delta$. Again $\Delta$ and $a$ have the
same parity since $\Delta = a^2 - 4ac$, so $a$ is even, say $a = 2d$. From $\Delta = a^2 - 4ac$ we
then have $\delta = d^2 - 2dc = d(d - 2c)$.

Case 3: Forms $ax^2 + axy + cy^2$ with $\Delta = 4\delta$ and $a = 2d$ for odd $d$. By primitivity
$c$ must be odd. The two factors of $\delta = d(d - 2c)$ are odd and must be distinct
mod 4 since $c$ is odd. Thus one factor is 1 mod 4 and the other is 3 mod 4, so
$\delta \equiv 3$ mod 4, say $\delta = 4m + 3$. We claim that when $\delta = 4m + 3$, each divisor $d$
of $\delta$ gives rise to a form $ax^2 + axy + cy^2$ with $a = 2d$. To show this, note first
that $d$ must be odd since it divides $\delta$ which is odd. Solving $\delta = d(d - 2c)$ for $c$
gives $c = (d^2 - \delta)/2d$. Since $d$ and $\delta$ are odd, the numerator $d^2 - \delta$ is even hence
divisible by the 2 in the denominator. The numerator is also divisible by the $d$ in
the denominator if $d$ divides $\delta$. Since $d$ is odd, this implies that $2d$ divides the
numerator, so $c$ is an integer for each divisor $d$ of $\delta$. In fact $c$ is an odd integer
since the numerator $d^2 - \delta$ is 2 mod 4 and so $cd = (d^2 - \delta)/2$ is odd, forcing $c$ to be
odd. For the form $ax^2 + axy + cy^2$ to be primitive means that $a$ and $c$ are coprime.
Since $c$ is odd and $a = 2d$ this is equivalent to $c$ and $d$ being coprime. This in turn is
equivalent to the two factors of $\delta = d(d - 2c)$ being coprime since $c$ and $d$ are odd.
Thus when $\delta = 4m + 3$ we get a primitive form $ax^2 + axy + cy^2$ for each choice of
a subset of the distinct prime divisors of $\delta$ since this determines $d$ as before, and $d$
determines $c$ and $a$. The number of primitive forms $ax^2 + axy + cy^2$ with $a > 0$ is then
$2^{k'}$ when $\Delta$ is even and $a = 2d$ with $d$ odd, where $k'$ is the number of distinct prime
divisors of $\delta$ as in Case 1.

Case 4: Forms $ax^2 + axy + cy^2$ with $\Delta$ even and $a = 2d$ for even $d$, say $d = 2e$.
Then $\delta = d(d - 2c) = 4e(e - c)$. Since $c$ is odd by primitivity of the form, the two
factors $e$ and $e - c$ of $\delta = 4e(e - c)$ have opposite parity, hence $\delta$ must be divisible
by 8, say $\delta = 8m$. We need to determine which choices of $e$ and $c$ yield primitive
forms $ax^2 + axy + cy^2$. Let $\delta' = \delta/4 = 2m$ so the equation $\delta = 4e(e - c)$ becomes
$\delta' = e(e - c)$. Thus $e$ must divide $\delta'$. We have $c = e - \delta'/e$, and this will be an
integer if $e$ divides $\delta'$. From the equation $c = e - \delta'/e$ we see that any divisor of
two of the three terms $c$, $e$, and $\delta'/e$ will divide the third. In particular, $c$ and $e$
will be coprime exactly when $e$ and $\delta'/e$ are coprime. Since $\delta' = e\cdot\delta'/e$ this means
we want to choose $e$ by choosing some subset of the distinct prime divisors of $\delta'$
and letting $e$ be the product of these primes raised to the same powers as in $\delta'$.
Then $e$ and $\delta'/e$ will be coprime and of opposite parity since they are not both even
and their product $\delta'$ is even. Their difference $c = e - \delta'/e$ will then be odd. Also,
$c$ and $e$ will be coprime so $c$ and $a = 4e$ will be coprime, making the form
$ax^2 + axy + cy^2$ primitive. The number of distinct prime divisors of $\delta'$ is the same
as for $\delta = 4\delta'$ since $\delta'$ is even. Thus in Case 4 the number of primitive forms
$ax^2 + axy + cy^2$ with $a > 0$ is $2^{k'}$.


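Note that $k' = k$ when $\delta$ is even and $k' = k - 1$ when $\delta$ is odd. By combining
the four cases above and remembering to double the number of forms when $\Delta > 0$
to account for negative coefficients of $x^2$, we then obtain the following table for the
number of forms of either of the types $ax^2 + cy^2$ or $ax^2 + axy + cy^2$:

$\Delta$     | $\Delta$ odd | $4\delta$, $\delta = 4m+1$ | $4\delta$, $\delta = 4m+3$
Cases        | (2)          | (1)                        | (1) and (3)
$\Delta < 0$ | $2^k$        | $2^{k'} = 2^{k-1}$         | $2^{k'} + 2^{k'} = 2^{k'+1} = 2^k$
$\Delta > 0$ | $2^{k+1}$    | $2^{k'+1} = 2^k$           | $2^{k'+1} + 2^{k'+1} = 2^{k'+2} = 2^{k+1}$

$\Delta$     | $4\delta$, $\delta = 8m$                   | $4\delta$, $\delta$ even, $\delta \ne 8m$
Cases        | (1) and (4)                                | (1)
$\Delta < 0$ | $2^{k'} + 2^{k'} = 2^{k'+1} = 2^{k+1}$     | $2^{k'} = 2^k$
$\Delta > 0$ | $2^{k'+1} + 2^{k'+1} = 2^{k'+2} = 2^{k+2}$ | $2^{k'+1} = 2^{k+1}$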


Comparing the results in the table with the statement of the theorem, we see that the
proof will be finished when we show that under the relation of equivalence the special
forms split up into pairs when $\Delta < 0$ and into groups of four when $\Delta > 0$.
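
The counts in the table can be spot-checked numerically for negative discriminants. The sketch below is only an illustration under stated assumptions: the function name `special_forms` is hypothetical, and it simply tries every positive divisor $a$ of $\Delta$ in the two families $[a,0,c]$ and $[a,a,c]$, using $c = -\Delta/4a$ and $c = (a^2 - \Delta)/4a$ respectively.

```python
from math import gcd

def special_forms(disc):
    """Primitive forms [a,0,c] and [a,a,c] with a > 0 of discriminant disc < 0.

    Hypothetical helper for illustration; brute force over divisors a of disc.
    """
    found = []
    for a in range(1, -disc + 1):
        if disc % a:                   # in both families a must divide disc
            continue
        # Family [a, 0, c]: disc = -4ac
        if disc % (4 * a) == 0:
            c = -disc // (4 * a)
            if gcd(a, c) == 1:
                found.append((a, 0, c))
        # Family [a, a, c]: disc = a^2 - 4ac
        if (a * a - disc) % (4 * a) == 0:
            c = (a * a - disc) // (4 * a)
            if gcd(a, c) == 1:
                found.append((a, a, c))
    return found

forms = special_forms(-420)
print(len(forms))
print(sorted(a for a, b, c in forms if b == a))
```

For $\Delta = -420$ we have $\delta = -105 = 4m + 3$ with $k = 4$ and $k' = 3$, so the table predicts $2^{k'} + 2^{k'} = 2^k = 16$ special forms, twice the number of equivalence classes given by the theorem.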

Two easy cases that can be disposed of first are $\Delta = -3$ and $\Delta = -4$. Here all
forms are equivalent and are primitive, and $k = 1$, so the theorem is true since the
exceptional cases (a) and (b) in the statement of the theorem do not apply.

Our earlier analysis of symmetries of elliptic and hyperbolic forms shows that the
only time that reflector lines can intersect is for elliptic forms equivalent to
$ax^2 + ay^2$ or $ax^2 + axy + ay^2$, so when we restrict to primitive forms this means
$\Delta = -3$ or $\Delta = -4$. Thus we may assume from now on that reflector lines do not
intersect.

For a form $ax^2 + cy^2$ with a reflector line perpendicular to an edge of the
topograph as in the first figure at the right we have $a \ne c$, otherwise there would
be two intersecting reflector lines. Thus the reflector line corresponds to two
distinct special forms, $ax^2 + cy^2$ and $cx^2 + ay^2$.

The second figure shows the case of a form with a reflector line containing an edge of
the topograph. This edge corresponds to a form $ax^2 + bxy + ay^2$ and the adjacent
edges correspond to two forms $dx^2 + dxy + ay^2$ and $ex^2 + exy + ay^2$ of the type
$ax^2 + axy + cy^2$. These two forms are distinct since if $d = e$ there would be a second
reflector line intersecting the first one. Thus the reflector line accounts for two
special forms $ax^2 + axy + cy^2$.

Primitive elliptic forms with mirror symmetry and $\Delta \ne -3, -4$ have just one
reflector line, so each equivalence class of such forms contains exactly two special
forms. For hyperbolic forms with mirror symmetry there are two reflector lines in
each period, with one pair of special forms for each reflector line. These two pairs
give four distinct special forms, otherwise there would be a translational symmetry
taking one reflector line to the other within a single period, which is impossible. Thus
each equivalence class of mirror-symmetric hyperbolic forms contains exactly four
special forms, and the proof is complete. □


We illustrate the theorem with an example, the first negative discriminant with
four distinct prime divisors, $\Delta = -420 = -3\cdot 4\cdot 5\cdot 7$. In this case $\Delta = 4(4m + 3)$ so
the theorem says there are $2^3 = 8$ equivalence classes of symmetric primitive forms.
If we compute all the reduced forms for $\Delta = -420$ by the method in Section 5.2 we
get the following table, with the letter $b$ replacing $h$ so we are finding solutions of
$b^2 + 420 = 4ac$ with $0 \le b \le a \le c$. The entries $[a,b,c]$ in the last column give the
reduced forms $ax^2 + bxy + cy^2$.


 b | ac  | (a,c)   | [a,b,c]
 0 | 105 | (1,105) | [1,0,105]
   |     | (3,35)  | [3,0,35]
   |     | (5,21)  | [5,0,21]
   |     | (7,15)  | [7,0,15]
 2 | 106 | (2,53)  | [2,2,53]
 4 | 109 | none    |
 6 | 114 | (6,19)  | [6,6,19]
 8 | 121 | (11,11) | [11,8,11]
10 | 130 | (10,13) | [10,10,13]


Thus all forms of discriminant $-420$ are symmetric. The first four have $b = 0$ so
these arise in Case 1 in the proof of the theorem where we set $\Delta = 4\delta$, so
$\delta = -3\cdot 5\cdot 7$ and we get a form $[a,0,c]$ for each positive divisor $a$ of $\delta$, the eight
numbers 1, 3, 5, 7, 15, 21, 35, and 105. These forms $[a,0,c]$ are the first four entries
in the last column of the table along with the equivalent forms obtained by reversing
$a$ and $c$. The remaining four forms in the last column have $b$ nonzero and are
instances of forms $[a,a,c]$ and $[a,b,a]$. The relevant parts of the topographs of these
four forms are shown in the figure to the right of the table. Each edge in the figure
gives a form $[a,b,a]$, $[a,a,c]$, or $[a,c,c]$. For example the third figure gives the forms
$[11,8,11]$, $[11,14,14]$, $[14,14,11]$, $[11,30,30]$, and $[30,30,11]$. In the proof of the
theorem we were only counting the forms $[a,a,c]$, not $[a,b,a]$ or $[a,c,c]$. According to
Case 3 in the proof of the theorem the numbers $a$ in the forms $[a,a,c]$ should be
twice the numbers $a$ in the forms $[a,0,c]$, and they are: $2 = 2\cdot 1$, $6 = 2\cdot 3$, $10 = 2\cdot 5$,
$14 = 2\cdot 7$, $30 = 2\cdot 15$, $42 = 2\cdot 21$, $70 = 2\cdot 35$, and $210 = 2\cdot 105$.


Corollary 5.10. The nonsquare discriminants $\Delta$ with $h_\Delta^s = 1$ are $\Delta = -4$, $\pm 8$, $-16$,
$\pm p^{2k+1}$ and $\pm 4p^{2k+1}$ for odd primes $p$ with $p \equiv 1$ mod 4 when $\Delta > 0$ and $p \equiv 3$
mod 4 when $\Delta < 0$. In particular, the only fundamental discriminants with $h_\Delta^s = 1$
are $\Delta = -4$, $\pm 8$, and $\pm p$ for odd primes $p$, with $p \equiv 1$ mod 4 when $\Delta > 0$ and
$p \equiv 3$ mod 4 when $\Delta < 0$.


Proof: Consider first the case $\Delta > 0$. If we are not in one of the exceptional cases (a)
and (b) in Theorem 5.9 then $\Delta$ must have just one distinct prime divisor so it must be
a power of a prime, in fact an odd power if it is not a square. Thus for $p$ odd we have
$\Delta = p^{2k+1}$ and we must have $p \equiv 1$ mod 4 in order to have $\Delta \equiv 1$ mod 4. For odd
powers of $p = 2$ the only possibility is $\Delta = 8$ since $\Delta$ cannot be 2 and odd powers
beyond 8 are of the form $\Delta = 32m$, the exceptional case (b) where $h_\Delta^s = 2$ so this is
ruled out as well. In the exceptional case (a) we have $\Delta = 4(4m + 1)$ with $4m + 1$ a
prime power $p^{2k+1}$ with $p \equiv 1$ mod 4 since $\Delta = 4p^{2k}$ is a square.


When $\Delta < 0$ the reasoning is similar, the main difference being that $-p^{2k}$ and
$-4p^{2k}$ are ruled out, not because squares are excluded, but because $p^{2k}$ is always 1
mod 4 when $p$ is odd, so $-p^{2k}$ is 3 mod 4. This rules out $-p^{2k}$ as a discriminant,
and it rules out $-4p^{2k}$ being an exceptional case $\Delta = 4(4m + 1)$.

Requiring $\Delta$ to be a fundamental discriminant eliminates the cases $\Delta = -16$ and
$\pm 4p^{2k+1}$ and restricts the exponent in $\pm p^{2k+1}$ to be 1. □


We have mentioned the fact that $h_\Delta$ is always a multiple of $h_\Delta^s$, which will be
proved in Proposition 7.17. This tells us nothing about $h_\Delta$ when $h_\Delta^s = 1$, but we will
also prove that $h_\Delta^s = 1$ exactly when $h_\Delta$ is odd. Thus the preceding corollary gives a
way to determine whether $h_\Delta$ is even or odd. In the examples we have looked at so
far $h_\Delta$ has been either 1 or even, but odd numbers greater than 1 can also occur as
class numbers. The table below gives some examples for negative discriminants, so
we are finding the solutions of $h^2 + |\Delta| = 4ac$ with $0 \le h \le a \le c$ as usual.

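$\Delta$ | $h$ | $ac$ | $(a,c)$                                          | $h_\Delta$
-23      | 1   | 6    | (1,6), (2,3)                                     | 3
-47      | 1   | 12   | (1,12), (2,6), (3,4)                             | 5
         | 3   | 14   | none                                             |
-71      | 1   | 18   | (1,18), (2,9), (3,6)                             | 7
         | 3   | 20   | (4,5)                                            |
-199     | 1   | 50   | (1,50), (2,25), (5,10)                           | 9
         | 3   | 52   | (4,13)                                           |
         | 5   | 56   | (7,8)                                            |
         | 7   | 62   | none                                             |
-167     | 1   | 42   | (1,42), (2,21), (3,14), (6,7)                    | 11
         | 3   | 44   | (4,11)                                           |
         | 5   | 48   | (6,8)                                            |
         | 7   | 54   | none                                             |
-191     | 1   | 48   | (1,48), (2,24), (3,16), (4,12), (6,8)            | 13
         | 3   | 50   | (5,10)                                           |
         | 5   | 54   | (6,9)                                            |
         | 7   | 60   | none                                             |
-239     | 1   | 60   | (1,60), (2,30), (3,20), (4,15), (5,12), (6,10)   | 15
         | 3   | 62   | none                                             |
         | 5   | 66   | (6,11)                                           |
         | 7   | 72   | (8,9)                                            |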


The examples in the table are all fundamental discriminants, and in each case they 
are the first negative discriminant with the given class number. 
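
The class numbers in the table can be recomputed mechanically. The following is a small sketch under the assumption, used throughout this section, that a reduced form $[a,h,c]$ with $0 \le h \le a \le c$ accounts for one proper equivalence class when it has mirror symmetry ($h = 0$, $h = a$, or $a = c$) and for two otherwise. The function names are hypothetical.

```python
from math import gcd, isqrt

def class_number(disc):
    """Number of proper equivalence classes of primitive forms, for disc < 0.

    Illustrative helper: enumerates solutions of h^2 + |disc| = 4ac
    with 0 <= h <= a <= c and gcd(a, h, c) = 1.
    """
    total = 0
    h = disc % 2                       # h has the same parity as disc
    while 3 * h * h <= -disc:          # h <= a <= c forces 3h^2 <= |disc|
        ac = (h * h - disc) // 4
        for a in range(max(h, 1), isqrt(ac) + 1):
            if ac % a == 0:
                c = ac // a
                if gcd(gcd(a, h), c) == 1:
                    symmetric = h == 0 or h == a or a == c
                    total += 1 if symmetric else 2
        h += 2
    return total

def first_with_class_number(n, limit=1000):
    """Negative discriminant of smallest absolute value below limit with class number n."""
    for d in range(3, limit):
        if (-d) % 4 in (0, 1) and class_number(-d) == n:
            return -d
    return None

for disc in (-23, -47, -199, -167, -191, -239):
    print(disc, class_number(disc))
```

Calling `first_with_class_number(n)` for odd `n` gives a way to test the remark that these discriminants are the first ones with the given class number, within whatever search limit is chosen.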

Besides the cases when $h_\Delta = 1$, another nice situation is when $h_\Delta = h_\Delta^s$ so all
primitive forms of discriminant $\Delta$ have mirror symmetry. We call such discriminants
fully symmetric. As we will see in the following chapters, forms with fully symmetric
discriminants have very special properties. A table at the end of the book lists the 101
known negative discriminants that are fully symmetric, ranging from $-3$ to $-7392$.
Of these, 65 are fundamental discriminants, the largest being $-5460$. Since 5460
factors as $3\cdot 4\cdot 5\cdot 7\cdot 13$ with five distinct prime factors, Theorem 5.9 says that
$h_\Delta^s = 2^4 = 16$. This is in fact the largest value of $h_\Delta^s$ among the 101 discriminants in
the list. Computer calculations have extended to much larger negative discriminants
without finding any more that are fully symmetric. It has not yet been proved that no
more exist, although it is known that there are at most two more. For positive
discriminants there are probably infinitely many that are fully symmetric since it is
likely that there are already infinitely many with $h_\Delta = 1$.

Among the examples of hyperbolic forms we have considered there were some 
whose topograph had a “symmetry” which was a glide reflection along the separator 
line that had the effect of changing each value to its negative rather than preserving 
the values. These are not actual symmetries according to the definition we have given, 
so let us call such a transformation that takes each value of a form to its negative a 
skew symmetry. (Compare this with skew-symmetric matrices in linear algebra which 
equal the negative of their transpose.) 


A skew symmetry must take the separator line to itself while interchanging the 
two sides of the separator line, so it either translates the separator line along itself and 
hence is a glide reflection, or it reflects the separator line, interchanging its two ends 
as well as the two sides of the separator line, making it a 180 degree rotation about 
a point of the separator line. Examples of forms with this sort of skew symmetry 
occurred in Chapter 4, the forms $x^2 - 13y^2$ and $10x^2 - 29y^2$. 


The figures below show forms whose separator lines have all the possible
combinations of symmetries and skew symmetries.

[Figures: separator lines of five hyperbolic forms, including $7x^2 + 5xy - 7y^2$,
$3x^2 + 8xy - 7y^2$, and $5x^2 + 14xy - 10y^2$.]


The first form has all four types: translations, mirror symmetries, glide reflections, 
and rotations. The next three forms have only one type of symmetry or skew symmetry 
besides translations, while the last form has only translational symmetries and no 
mirror symmetries or skew symmetries. It is not possible to have two of the three 
types of nontranslational symmetries and skew symmetries without having the third 
since the composition of two of these three types gives the third type. One can see 
this by considering the effect of a symmetry or skew symmetry on the orientation of 
the plane and the orientation of the separator line. The four possible combinations 
distinguish the four types of transformations according to the following chart, where a
plus sign means orientation-preserving and a minus sign means orientation-reversing.

                 | plane orientation | line orientation
translation      |         +         |        +
rotation         |         +         |        -
glide reflection |         -         |        +
reflection       |         -         |        -
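
The claim that composing two of the three nontranslational types gives the third can be read off from the chart by multiplying signs. The sketch below only tracks the two orientation signs, so it identifies the type of a composition and nothing more; the dictionary and function names are hypothetical.

```python
# Each type is recorded as (plane orientation, line orientation), as in the chart.
SIGNS = {
    "translation": (+1, +1),
    "rotation": (+1, -1),
    "glide reflection": (-1, +1),
    "reflection": (-1, -1),
}
NAMES = {signs: name for name, signs in SIGNS.items()}

def compose(first, second):
    """Type of the composition, found by multiplying the two orientation signs."""
    (p1, l1), (p2, l2) = SIGNS[first], SIGNS[second]
    return NAMES[(p1 * p2, l1 * l2)]

print(compose("rotation", "reflection"))
print(compose("glide reflection", "reflection"))
print(compose("rotation", "glide reflection"))
```

Composing any type with itself gives a pair of plus signs, which is the translation row of the chart (this includes the identity).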


A rotational skew symmetry is a rotation about the midpoint of an edge of the
separator line where the two adjacent regions have labels $a$ and $-a$. If the edge
separating these two regions has label $b$ then the form associated to this edge is
$ax^2 + bxy - ay^2$. Conversely, any form $ax^2 + bxy - ay^2$ whose discriminant
$\Delta = b^2 + 4a^2$ is not a square (although it is the sum of two squares) will be a
hyperbolic form having a rotational skew symmetry, as one can see in the figure at the
right. Note that the form $ax^2 + bxy - ay^2$ will be one of the reduced forms in the
equivalence class of the given form since the two edges leading off the separator line
at the ends of the edge labeled $b$ do so on opposite sides of the separator line. Thus
rotational skew symmetries can be detected by looking just at the reduced forms. The
same is true for mirror symmetries and glide reflection skew symmetries, but for these
one must look at the arrangement of the whole cycle of reduced forms rather than just
the individual reduced forms.

For rotational skew symmetries there are two rotation points along the separator 
line in each period, just as reflector lines occur in pairs in each period. 
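
As a quick sanity check of the algebra behind rotational skew symmetries, one can verify that the change of variables $(x,y) \mapsto (-y,x)$, which has determinant 1, takes $Q(x,y) = ax^2 + bxy - ay^2$ to $-Q$. The sketch below tests this on a grid of integer points; the function names are hypothetical, and identifying this substitution with the rotation about the midpoint of the edge is our reading of the text rather than something the code proves.

```python
from math import isqrt

def skew_form(a, b):
    """The form Q(x, y) = a x^2 + b x y - a y^2 as a Python function."""
    return lambda x, y: a * x * x + b * x * y - a * y * y

def negated_by_quarter_turn(a, b, window=6):
    """Check Q(-y, x) == -Q(x, y) on a square grid of integer points."""
    Q = skew_form(a, b)
    return all(Q(-y, x) == -Q(x, y)
               for x in range(-window, window + 1)
               for y in range(-window, window + 1))

def has_nonsquare_discriminant(a, b):
    """Discriminant b^2 + 4a^2 of a x^2 + b x y - a y^2 is not a perfect square."""
    disc = b * b + 4 * a * a
    return isqrt(disc) ** 2 != disc

print(negated_by_quarter_turn(7, 5), has_nonsquare_discriminant(7, 5))
```

The example uses $7x^2 + 5xy - 7y^2$, one of the forms shown in the figures above, whose discriminant is $5^2 + 4\cdot 7^2 = 221$.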


Exercises 


1. Show that the number of symmetries of an elliptic form, including the identity 
transformation, is 1, 2, 4, or 6. 


2. Show that the number of equivalence classes of forms of discriminant 45 with 
mirror symmetry is not a power of 2 if nonprimitive as well as primitive forms are 
allowed. (Compare this with Theorem 5.9.) 


3. In the text an example was given of a hyperbolic form having only translational 
symmetries and no skew symmetries, the form $5x^2 + 14xy - 10y^2$. Find another 
example of the same sort which is not equivalent to this form or a constant times it. 
Hint: First find a separator line with the desired properties, without any labels along 
the line, then find a form realizing that separator line. 


4. Show that a positive nonsquare number is the discriminant of some hyperbolic 
form whose topograph has a rotational skew symmetry if and only if the number is 
the sum of two squares at least one of which is even. 

5. Verify that the following discriminants are fully symmetric, so all primitive forms
of that discriminant have mirror symmetry:

(a) -195  (b) -660  (c) 195


6. Show that the topograph of a primitive 0-hyperbolic form $qxy - py^2$ has mirror
symmetry exactly when $p^2 \equiv 1$ mod $q$, and has rotational skew symmetry exactly
when $p^2 \equiv -1$ mod $q$. (See the discussion in Chapter 2 about the relation between
the continued fraction for $p/q$ and the continued fraction obtained by reversing the
order of the terms.)


5.5 Charting All Forms 


We have used the Farey diagram to study individual quadratic forms through 
their topographs, and in this section we will see that the Farey diagram also appears 
in another way when one seeks a global picture of all forms simultaneously. This 
viewpoint will not play an essential role in later chapters, however, so this section can 
be regarded as something of a digression from the main line of the book. 


Quadratic forms are defined by formulas $ax^2 + bxy + cy^2$, and our point of
view will be to regard the coefficients $a$, $b$, and $c$ as parameters that vary over
all integers independently. It is natural to consider the triples $(a,b,c)$ as points
in 3-dimensional Euclidean space $\mathbb{R}^3$, and more specifically as points in the integer
lattice $\mathbb{Z}^3$ consisting of points $(a,b,c)$ whose coordinates are integers. We will
exclude the origin $(0,0,0)$ since this corresponds to the trivial form that is
identically zero. Instead of using the usual $(x,y,z)$ as coordinates for $\mathbb{R}^3$ we will
use $(a,b,c)$, but since $a$ and $c$ play a symmetric role as the coefficients of the
squared terms $x^2$ and $y^2$ we will position the $a$-axis and the $c$-axis in a horizontal
plane, with the $b$-axis vertical, perpendicular to the $ac$-plane.

Along a ray starting at $(0,0,0)$ and passing through another lattice point $(a,b,c)$
there are infinitely many lattice points $(ka,kb,kc)$ for positive integers $k$. If $a$, $b$,
and $c$ have a greatest common divisor larger than 1 we can cancel this common divisor
to get a primitive triple $(a,b,c)$ corresponding to a primitive form $ax^2 + bxy + cy^2$.
Then all the other lattice points on the ray through $(a,b,c)$ are the positive integer
multiples $(ka,kb,kc)$, corresponding to the nonprimitive forms $kax^2 + kbxy + kcy^2$.
Thus primitive forms correspond exactly to rays from the origin passing through
lattice points. These are the same as rays passing through points $(a,b,c)$ with rational
coordinates since denominators can always be eliminated by multiplying $a$, $b$, and $c$
by a common denominator.

Since the discriminant $\Delta = b^2 - 4ac$ plays such an important role in the
classification of forms, let us see how this fits into the picture in $(a,b,c)$ coordinates.
When $b^2 - 4ac$ is zero we have the special class of parabolic forms, and the points
in $\mathbb{R}^3$ satisfying the equation $b^2 - 4ac = 0$ form a double cone with the common
vertex of the two cones at the origin. The double cone intersects the $ac$-plane in the
$a$-axis and the $c$-axis. The central axis of the double cone is the line $a = c$ in the
$ac$-plane. Points $(a,b,c)$ inside either cone have $b^2 - 4ac < 0$ so the lattice points
inside the cones correspond to elliptic forms. Positive elliptic forms have $a > 0$ and
$c > 0$ so they lie inside the cone projecting to the first quadrant of the $ac$-plane. We
call this the positive cone. Inside the other cone are the negative elliptic forms, those
with $a < 0$ and $c < 0$. Outside the cones is a single region consisting of points with
$b^2 - 4ac > 0$ so the lattice points here correspond to hyperbolic forms and
0-hyperbolic forms.

If one slices the positive cone via the vertical plane $a + c = 1$ perpendicular to
the axis of the cone then the intersection of the cone with this plane is an ellipse
which we denote $E$. The top and bottom points of $E$ are $(a,b,c) = (1/2, \pm 1, 1/2)$ so
its height is 2. The left and right points of $E$ are $(1,0,0)$ and $(0,0,1)$ so its width
is $\sqrt{2}$. Thus $E$ is somewhat elongated vertically. If we wanted, we could compress
the vertical coordinate to make $E$ a circle, but there is no special advantage to doing
this.

If we take a lattice point $(a,b,c)$ corresponding to a primitive positive elliptic
form and project this lattice point along the ray to the origin passing through $(a,b,c)$,
this ray intersects the plane $a + c = 1$ in the point $(a/(a+c), b/(a+c), c/(a+c))$ since
the sum of the first and third coordinates of this point is 1. This point lies inside the
ellipse $E$ and has rational coordinates. Conversely, every point inside $E$ with rational
coordinates is the radial projection of a unique primitive positive elliptic form,
obtained by multiplying the coordinates of the point by the least common multiple of
their denominators. Thus the rational points inside $E$ parametrize primitive positive
elliptic forms. We will use the notation $[a,b,c]$ to denote both the form
$ax^2 + bxy + cy^2$ and the corresponding rational point $(a/(a+c), b/(a+c), c/(a+c))$
inside $E$. The figure below shows some examples, including a few parabolic forms on
$E$ itself.

[Figure: the ellipse $E$ with rational points labeled $[a,b,c]$, among them $[1,0,1]$,
$[2,0,1]$, $[1,0,2]$, $[3,0,1]$, $[1,0,3]$, $[2,1,2]$, $[2,2,1]$, $[1,2,2]$, $[4,4,1]$, $[1,4,4]$,
$[4,6,3]$, $[3,6,4]$, $[16,8,1]$, $[1,8,16]$, $[16,24,9]$, and $[9,24,16]$, together with the
lines along which one of the ratios $b/a$, $b/c$, or $a/c$ is constant.]
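
The correspondence between primitive triples and rational points in the plane $a + c = 1$ is easy to experiment with using exact rational arithmetic. The following is a minimal sketch, with hypothetical function names, of the radial projection and its inverse (multiplying by the least common multiple of the denominators).

```python
from fractions import Fraction
from math import gcd

def project(a, b, c):
    """Radial projection of the lattice point (a, b, c) to the plane a + c = 1."""
    s = a + c                          # nonzero for positive elliptic forms
    return (Fraction(a, s), Fraction(b, s), Fraction(c, s))

def lift(point):
    """Primitive integer triple on the ray through a rational point."""
    m = 1
    for coord in point:                # least common multiple of the denominators
        m = m * coord.denominator // gcd(m, coord.denominator)
    return tuple(int(coord * m) for coord in point)

def inside_E(point):
    """True when the point lies in the plane a + c = 1 strictly inside the cone."""
    a, b, c = point
    return a + c == 1 and b * b - 4 * a * c < 0

pt = project(4, 6, 3)
print(pt, inside_E(pt), lift(pt))
```

Projecting a nonprimitive triple such as $(2,4,2)$ and lifting back returns the primitive triple $(1,2,1)$ on the same ray, which here is the parabolic form $(x+y)^2$ lying on $E$ itself.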


In the figure the lines radiating out from the points $[1,0,0]$ and $[0,0,1]$ consist of
the points $[a,b,c]$ with a fixed ratio $b/c$ or $b/a$. The ratios $a/c$ are fixed along
vertical lines. Two out of three of these ratios determine the third since
$b/a \cdot a/c = b/c$.

Of special interest are the reduced primitive elliptic forms $[a,b,c]$, which are
those satisfying $0 \le b \le a \le c$ where $a$, $b$, and $c$ have no common divisor. These
correspond to the rational points in the shaded triangle in the figure above, with
vertices $[1,1,1]$, $[1,0,1]$, and $[0,0,1]$. The edges of the triangle correspond to one
of the three inequalities $0 \le b \le a \le c$ becoming an equality, so $b = 0$ for the lower
edge, $a = c$ for the vertical edge, and $a = b$ for the hypotenuse. Thus the three edges
correspond to the reduced forms with mirror symmetry, the forms $[a,0,c]$ for the
bottom edge, $[a,b,a]$ for the left edge, and $[a,a,c]$ for the diagonal edge. Points in
the interior of the triangle correspond to forms without mirror symmetry.


Just as rational points inside the ellipse $E$ correspond to primitive positive elliptic
forms, the rational points on $E$ itself correspond to primitive positive parabolic forms.
As we know, every parabolic form is equivalent to the form $ax^2$ for some nonzero
integer $a$. For this to be primitive means that $a = \pm 1$, so every positive primitive
parabolic form is equivalent to $x^2$. Equivalent forms are those that can be obtained
from each other by a change of variable replacing $(x,y)$ by $(px + qy, rx + sy)$ for
some integers $p,q,r,s$ satisfying $ps - qr = \pm 1$. For the form $x^2$ this means that the
primitive positive parabolic forms are the forms $(px + qy)^2 = p^2x^2 + 2pqxy + q^2y^2$
for any pair of coprime integers $p$ and $q$. In $[a,b,c]$ notation this is $[p^2, 2pq, q^2]$,
defining a point on the ellipse $E$.


More concisely, we could label the rational point on $E$ corresponding to the form
$(px + qy)^2$ just by the fraction $p/q$. Thus at the left and right sides of $E$ we have
the fractions $1/0$ and $0/1$ corresponding to the forms $x^2$ and $y^2$, while at the top and
bottom of $E$ we have $1/1$ and $-1/1$ corresponding to $(x+y)^2$ and $(x-y)^2 = (-x+y)^2$.


[Figure: the ellipse $E$ with its rational points labeled both by fractions $p/q$ and by
the corresponding forms $(px + qy)^2$, for example $(2x+3y)^2$, $(3x+2y)^2$, $(3x+4y)^2$,
and $(4x+3y)^2$.]

Note that changing the signs of both $p$ and $q$ does not change the form $(px + qy)^2$
or the fraction $p/q$. In the first quadrant of the ellipse the fractions $p/q$ increase
monotonically from $0/1$ to $1/1$ since the ratio $b/c$ equals $2p/q$ and $b$ is increasing
while $c$ is decreasing so $b/c$ is increasing, and hence so is $p/q$. Similarly in the
second quadrant the values of $p/q$ increase from $1/1$ to $1/0$ since we have $b/a = 2q/p$
which decreases as $b$ decreases and $a$ increases. In the lower half of the ellipse we
have just the negatives of the values in the upper half since the sign of $b$ has changed
from plus to minus.


This labeling of the rational points of $E$ by fractions $p/q$ seems very similar to the
labeling of vertices in the circular Farey diagram. As we saw in Section 1.1, if the Farey
diagram is drawn with $1/0$ at the top of the unit circle in the $xy$-plane, then the point
on the unit circle labeled $p/q$ has coordinates
$(x,y) = (2pq/(p^2+q^2), (p^2-q^2)/(p^2+q^2))$. After rotating the circle to put $1/0$ on the
left side by replacing $(x,y)$ by $(-y,x)$ this becomes
$((q^2-p^2)/(p^2+q^2), 2pq/(p^2+q^2))$. Here the $y$-coordinate $2pq/(p^2+q^2)$ is the same as
the $b$-coordinate of the point of $E$ labeled $p/q$, which is the point
$(a,b,c) = (p^2/(p^2+q^2), 2pq/(p^2+q^2), q^2/(p^2+q^2))$. Since the vertical coordinates of
points in either the left or right half of the unit circle or the ellipse $E$ determine the
horizontal coordinates uniquely, this means that the labeling of points of $E$ by
fractions $p/q$ is really the same as in the circular Farey diagram.
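
The agreement between the two labelings can be confirmed with exact fractions. The sketch below, using hypothetical function names, computes the point of $E$ labeled $p/q$ and the rotated point of the unit circle with the same label, so that their vertical coordinates can be compared.

```python
from fractions import Fraction

def point_on_E(p, q):
    """Point (a, b, c) of the ellipse E labeled p/q, i.e. the form (px + qy)^2."""
    n = p * p + q * q
    return (Fraction(p * p, n), Fraction(2 * p * q, n), Fraction(q * q, n))

def farey_point_rotated(p, q):
    """Point of the unit circle labeled p/q, after rotating 1/0 to the left side."""
    n = p * p + q * q
    x, y = Fraction(2 * p * q, n), Fraction(p * p - q * q, n)  # 1/0 at the top
    return (-y, x)                                             # (x, y) -> (-y, x)

for p, q in [(1, 0), (0, 1), (1, 1), (2, 3), (-3, 4)]:
    a, b, c = point_on_E(p, q)
    X, Y = farey_point_rotated(p, q)
    print(f"{p}/{q}: b = {b}, Y = {Y}, c - a = {c - a}, X = {X}")
```

In this sketch the horizontal coordinate on the circle also equals $c - a$, which is one way to see that the ellipse is just the circle stretched by a constant factor in one direction.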
Let us return now to the general picture of how forms $ax^2 + bxy + cy^2$ are
represented by points $(a,b,c)$ in $\mathbb{R}^3$. As we know, a change of variables by a linear
transformation $T$ sends $(x,y)$ to $T(x,y) = (px + qy, rx + sy)$, where $p,q,r,s$
are integers with $ps - qr = \pm 1$. This change of variables transforms each form into
an equivalent form. To see the effect of this change of variables on the coefficients
$(a,b,c)$ of a form $Q(x,y) = ax^2 + bxy + cy^2$ we do a simple calculation:

$$
\begin{aligned}
Q(px+qy, rx+sy) &= a(px+qy)^2 + b(px+qy)(rx+sy) + c(rx+sy)^2 \\
&= (ap^2 + bpr + cr^2)x^2 + (2apq + bps + bqr + 2crs)xy \\
&\qquad + (aq^2 + bqs + cs^2)y^2
\end{aligned}
$$

This means that the (a,b,c) coordinates of points in R? are transformed according 
to the following formula: 


$$T^*(a,b,c) = (p^2a + prb + r^2c,\; 2pqa + (ps+qr)b + 2rsc,\; q^2a + qsb + s^2c)$$


For fixed values of p,q,r,s this T* is a linear transformation of the variables a,b,c. 
Its matrix is: 


$$\begin{pmatrix} p^2 & pr & r^2 \\ 2pq & ps+qr & 2rs \\ q^2 & qs & s^2 \end{pmatrix}$$
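
The formula for $T^*$ can be checked against direct substitution. The following sketch, with hypothetical function names, applies $T^*$ to a coefficient triple and compares the values of the new form with $Q(px+qy, rx+sy)$ at a few integer points; it also compares discriminants.

```python
def t_star(form, T):
    """Coefficients of Q(px + qy, rx + sy) for Q = [a, b, c] and T = (p, q, r, s)."""
    a, b, c = form
    p, q, r, s = T
    return (p * p * a + p * r * b + r * r * c,
            2 * p * q * a + (p * s + q * r) * b + 2 * r * s * c,
            q * q * a + q * s * b + s * s * c)

def evaluate(form, x, y):
    """Value of the form [a, b, c] at (x, y)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

Q = (3, 8, -7)
T = (2, 1, 1, 1)                       # ps - qr = 1
Q2 = t_star(Q, T)
print(Q2, discriminant(Q), discriminant(Q2))
```

For a general integer matrix the discriminant is multiplied by $(ps - qr)^2$, so it is preserved exactly when $ps - qr = \pm 1$, which is the case considered in the text.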


Since $T^*$ is a linear transformation, it takes lines to lines and planes to planes, but
$T^*$ also has another special geometric property. Since equivalent forms have the same
discriminant, this means that each surface defined by an equation $b^2 - 4ac = k$ for $k$
a constant is taken to itself by $T^*$. In particular, the double cone $b^2 - 4ac = 0$ is
taken to itself, and in fact each of the two cones separately is taken to itself since one
cone consists of positive parabolic forms and the other cone of negative parabolic
forms (as one can see just by looking at the coefficients $a$ and $c$), and positive
parabolic forms are never equivalent to negative parabolic forms. When $k > 0$ the
surface $b^2 - 4ac = k$ is a hyperboloid of one sheet and when $k < 0$ it is a hyperboloid
of two sheets. In the case of two sheets the lattice points on one sheet give positive
elliptic forms and the lattice points on the other sheet give negative elliptic forms.

Since $T^*$ takes lines through the origin to lines through the origin and the double
cone $b^2 - 4ac = 0$ to itself, this means that $T^*$ gives a transformation of the ellipse
$E$ to itself, taking rational points to rational points since rational points on $E$
correspond to lattice points on the cones. Regarding $E$ as the boundary circle of the
Farey diagram, we know that linear fractional transformations give symmetries of the
Farey diagram, also taking rational points on the boundary circle to rational boundary
points. And in fact, the transformation of this circle defined by $T^*$ is exactly one of
these linear fractional transformations. This is because $T^*$ takes the parabolic form
$(dx + ey)^2$ to the form $(d(px+qy) + e(rx+sy))^2 = ((dp+er)x + (dq+es)y)^2$ so in the
fractional labeling of points of $E$ this says $T^*(d/e) = (pd+re)/(qd+se)$ which is a
linear fractional transformation. If we write this using the variables $x$ and $y$ instead
of $d$ and $e$ it would be $T^*(x/y) = (px+ry)/(qx+sy)$. This is not quite the same as the
linear fractional transformation $T(x/y) = (px+qy)/(rx+sy)$ defined by the original
change of variables $T(x,y) = (px + qy, rx + sy)$, but rather $T^*$ is obtained from $T$ by
transposing the matrix of $T$, interchanging the off-diagonal terms $q$ and $r$.
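
The action of $T^*$ on the labels $d/e$ can be confirmed by applying the coefficient formula for $T^*$ to the parabolic form $(dx + ey)^2 = [d^2, 2de, e^2]$. The sketch below uses hypothetical function names and represents a label $d/e$ by the pair $(d, e)$, so that $1/0$ causes no trouble.

```python
def t_star(form, T):
    """Coefficients of Q(px + qy, rx + sy) for Q = [a, b, c] and T = (p, q, r, s)."""
    a, b, c = form
    p, q, r, s = T
    return (p * p * a + p * r * b + r * r * c,
            2 * p * q * a + (p * s + q * r) * b + 2 * r * s * c,
            q * q * a + q * s * b + s * s * c)

def square_form(d, e):
    """The parabolic form (dx + ey)^2 as a triple [d^2, 2de, e^2]."""
    return (d * d, 2 * d * e, e * e)

def t_star_label(d, e, T):
    """Label of the image point: d/e goes to (pd + re)/(qd + se)."""
    p, q, r, s = T
    return (p * d + r * e, q * d + s * e)

T = (2, 1, 1, 1)
for d, e in [(1, 0), (0, 1), (1, 1), (2, 3)]:
    print((d, e), "->", t_star_label(d, e, T))
```

Note that the coefficients $q$ and $r$ enter `t_star_label` in transposed positions compared with $T(x,y) = (px + qy, rx + sy)$, matching the remark about transposing the matrix of $T$.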


Via radial projection, the transformation T* determines a transformation not just 
of E but also of the interior of E in the plane a +c = 1. This transformation, which 
we still call T* for simplicity, takes lines inside E to lines inside E since T* takes 
planes through the origin to planes through the origin. 
This leads us to consider a linear version of the Farey 
diagram in which each circular arc of the original Farey 
diagram is replaced by a straight line segment joining 
the two endpoints of the circular arc. These line seg- 
ments divide the interior of E into triangles, just as the 
original Farey diagram divides the disk into curvilinear 
triangles. The transformation T* takes each of these tri- 
angles onto another triangle, analogous to the way that 
linear fractional transformations provide symmetries of 
the original Farey diagram. 


Suppose we divide each triangle of the linear Farey diagram into six smaller trian- 
gles as in the figure at the right, by adding diagonals to each quadrilateral formed by 
two adjacent triangles of the Farey diagram. The trans- 
formation T* takes each of these small triangles onto 
another small triangle since it takes lines to lines. One 
of these small triangles is the triangle defined by the inequalities
0 ≤ b ≤ a ≤ c that we considered earlier. The
fact that every positive primitive elliptic form is equiv- 
alent to exactly one reduced form, corresponding to a 
rational point in this special triangle, is now visible ge- 
ometrically as the fact that there is always exactly one 
transformation T* taking a given small triangle to this 
one special small triangle. 


Elliptic forms whose topograph contains a source edge are equivalent to forms 
ax^2 + cy^2 so these are the forms corresponding to rational points on the edges of
the original linear Farey diagram, before the subdivision into smaller triangles. These 
are the forms whose topograph has a symmetry reflecting across a line perpendicular 
to the source edge. (This line is just the edge in the Farey diagram containing the 
given form.) The other type of reflectional symmetry in the topograph of an elliptic 
form is reflection across an edge of the topograph. Forms with this sort of symmetry 
correspond to rational points in the dotted edges in the preceding figure, the edges 
we added to subdivide the Farey diagram into the smaller triangles. The dotted edges
are of two types according to whether the two equal values of the form in the three
regions surrounding the source vertex occur for the smallest value of the form (wide 
dotted edges) or the next-to-smallest value of the form (narrow dotted edges). Note 
that the wide dotted edges form the dual tree of the Farey diagram. 


Let us turn our attention now to hyperbolic and 0-hyperbolic forms, which cor- 
respond to integer lattice points that lie outside the two cones. As a preliminary 
observation, note that for a point (a,b,c) outside the double cone there are exactly 
two planes in R^3 that are tangent to the double cone and pass through (a,b,c).
Each of these planes is tangent to the double cone along a line through the origin. 
The two tangent planes through (a,b,c) are 
determined by their intersection with the plane 
a +c = 1, which consists of two lines tangent 
to the ellipse E. These two lines can either in- 
tersect or be parallel. The latter possibility oc- 
curs when the point (a,b,c) lies in the plane 
a + c = 0, so the two tangent planes intersect
in a line in this plane. For example, if the point 
(a,b,c) we start with happens to lie on the b-axis, then the tangent planes are the 
ab-plane and the bc-plane. These intersect the plane a + c = 1 in the two vertical 
tangent lines to the ellipse E. 

Our goal will be to show the following: 


Proposition 5.11. Let Q(x, y) = ax^2 + bxy + cy^2 be a form of positive discriminant,
either hyperbolic or 0-hyperbolic. Then the two points where the tangent lines to E 
determined by (a,b,c) touch E are the points diametrically opposite the two points 
that are the endpoints of the separator line in the topograph of Q in the case that 
Q is hyperbolic, or the two points labeling the regions in the topograph of Q where 
Q takes the value zero in the case that Q is 0-hyperbolic. 


Proof: We begin with a few preliminary remarks that will allow us to treat the hyperbolic
and 0-hyperbolic cases in the same way. A form Q(x, y) = ax^2 + bxy + cy^2
of positive discriminant can always be factored as (px + qy)(rx + sy) with p,q,r,s
real numbers since if a = 0 we have the factorization y(bx + cy) and if a ≠ 0 then
the associated quadratic equation ax^2 + bx + c = 0 has positive discriminant so it has
two distinct real roots α and β. This leads to the factorization ax^2 + bxy + cy^2 =
a(x - αy)(x - βy) which can be rewritten as (px + qy)(rx + sy) by incorporating a
into either factor. If Q is hyperbolic then the discriminant is not a square and hence
the factorization (px + qy)(rx + sy) will involve coefficients that are quadratic
irrationals. If Q is 0-hyperbolic then the discriminant is a square so the roots α and β
are rational and we obtain a factorization of Q as (px + qy)(rx + sy) with rational
coefficients. In fact we can take p,q,r,s to be integers in this case since we know
every 0-hyperbolic form is equivalent to a form y(bx + cy) so we can obtain the
given form Q from y(bx + cy) by replacing x and y by certain linear combinations
dx + ey and fx + gy with integer coefficients d,e,f,g.

The points where the tangent planes touch the double cone correspond to forms 
of discriminant zero, with coefficients that may not be integers or even rational. A 
simple way to construct two such forms from a given form Q = (px + qy)(rx + sy)
is just to take the squares of the two linear factors, so we obtain the forms (px + qy)^2
and (rx + sy)^2, each of discriminant zero. We will show that each of these two forms
lies on the line of tangency for one of the two tangent planes determined by Q. 

To do this for the case of (px + qy)^2 we consider the line L in R^3 passing through
the two points corresponding to the forms (px + qy)(rx + sy) and (px + qy)^2. We
claim that L consists of the forms

    Q_t(x, y) = (px + qy)[(1 - t)(rx + sy) + t(px + qy)]

as t varies over all real numbers. When t = 0 or t = 1 we obtain the two forms
Q_0 = (px + qy)(rx + sy) and Q_1 = (px + qy)^2 so these forms lie on L. Also, we
can see that the forms Q_t do form a straight line in R^3 by rewriting the formula for
Q_t(x, y) as ax^2 + bxy + cy^2 with the coefficients a,b,c given by:

    (a,b,c) = (pr(1 - t) + p^2 t, (ps + qr)(1 - t) + 2pqt, qs(1 - t) + q^2 t)

This defines a line since p,q,r,s are constants, so each coordinate is a linear function
of t. Since the forms Q_t factor as the product of two linear factors, they have
nonnegative discriminant for all t. This means that the line L does not go into the interior
of either cone. It also does not pass through the origin since if it did, it would have
to be a subset of the double cone since it contains the form Q_1 which lies in the
double cone. From these facts we deduce that L must be a tangent line to the double
cone. Hence the plane containing L and the origin must be tangent to the double cone
along the line containing the origin and Q_1. The same reasoning shows that the other
tangent plane that passes through (px + qy)(rx + sy) intersects the double cone
along the line containing the origin and (rx + sy)^2.
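The claim that L never enters the interior of either cone can be checked numerically. The Python sketch below evaluates the discriminant b^2 - 4ac along the line, using the coefficient formula for (a,b,c) in terms of t, and compares it with (1 - t)^2 (ps - qr)^2, which is what one gets by treating Q_t as a product of two linear factors. The sample values of p, q, r, s and the function names are assumptions made only for this illustration.

```python
from fractions import Fraction


def point_on_line(p, q, r, s, t):
    """Coefficients (a, b, c) of Q_t = (px + qy)[(1 - t)(rx + sy) + t(px + qy)]."""
    u = 1 - t
    return (p*r*u + p*p*t,
            (p*s + q*r)*u + 2*p*q*t,
            q*s*u + q*q*t)


def discriminant(form):
    a, b, c = form
    return b*b - 4*a*c


# Hypothetical sample factorization Q_0 = (x + 2y)(3x - y).
p, q, r, s = 1, 2, 3, -1
samples = [Fraction(k, 4) for k in range(-8, 9)]   # t = -2, -7/4, ..., 2
discriminants = [discriminant(point_on_line(p, q, r, s, t)) for t in samples]

for t, disc in zip(samples, discriminants):
    print(t, disc)
```

In this sample the discriminant is never negative and vanishes only at t = 1, the point Q_1 where the line touches the double cone.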

The labels of the points of E corresponding to the two forms (px + qy)^2 and
(rx + sy)^2 are p/q and r/s according to the convention we have adopted. On the other
hand, when the form (px + qy)(rx + sy) is hyperbolic the ends of the separator line
in its topograph are at the two points where this form is zero, which occur when x/y
is -q/p and -s/r. These are the negative reciprocals of the previous two points p/q
and r/s so they are the diametrically opposite points in E. Similarly, when the form
(px + qy)(rx + sy) is 0-hyperbolic the vertices of the Farey diagram where it is zero
are at -q/p and -s/r, again diametrically opposite p/q and r/s. □


It might have been nicer if the statement of the previous proposition did not 
involve passing to diametrically opposite points, but to achieve this we would have had
to use a different rule for labeling the points of E, with the label p/q corresponding
to the form (qx - py)^2 instead of (px + qy)^2. This 180 degree rotation of the labels
would put the negative labels in the upper half of E rather than the lower half, which 
does not seem like a good idea. 


Next let us investigate how hyperbolic and 0-hyperbolic forms are distributed
over the lattice points outside the double cone b^2 - 4ac = 0. This is easier to visualize
if we project such points radially into the plane a + c = 1. This only works for forms
ax^2 + bxy + cy^2 with a + c > 0, but the forms with a + c < 0 are just the negatives of
these so they give nothing essentially new. The forms with a + c = 0 will be covered 
after we deal with those with a +c > 0. 

Forms with a +c > 0 that are hyperbolic or 0-hyperbolic correspond via radial 
projection to points in the plane a +c = 1 outside the ellipse E. As we have seen, 
each such point determines a pair of tangent lines to E intersecting at the given point. 


For a 0-hyperbolic form (px + qy)(rx + sy) the points of tangency in E have
rational labels p/q and r/s. We know that every 0-hyperbolic form is equivalent to
a form y(rx + sy) with a = 0, so p/q = 0/1 and one line of tangency is the vertical
line tangent to E on the right side. The form y(rx + sy) corresponds to the point
(0,r,s) in the plane a = 0 tangent to the double cone. Projecting radially into the
vertical tangent line to E, we obtain the points (0,r/s,1), where r/s is an arbitrary
rational number. Thus 0-hyperbolic forms are dense in this vertical tangent line to E.
Choosing any rational number r/s, the other tangent line for the form y(rx + sy) is
tangent to E at the point labeled r/s.


An arbitrary 0-hyperbolic form (px + qy)(rx + sy) is obtained from one with
p/q = 0/1 by applying a linear fractional transformation T taking 0/1 to p/q, so the
vertical tangent line to E at 0/1 is taken to the tangent line at p/q, and the dense set of
0-hyperbolic forms in the vertical tangent line is taken to a dense set of 0-hyperbolic
forms in the tangent line at p/q. Thus we see that the 0-hyperbolic forms in the plane
a + c = 1 consist of all the rational points on all the tangent lines to E at rational
points p/q of E.

In the case of a hyperbolic form ax^2 + bxy + cy^2 with a + c > 0 the two tangent
lines intersect E at a pair of conjugate quadratic irrationals, the negative reciprocals of
the roots α and ᾱ of the equation ax^2 + bx + c = 0. Since α determines ᾱ uniquely,
one tangent line determines the other uniquely, unlike the situation for 0-hyperbolic
forms whose rational tangency points p/q and r/s can be varied independently. A
consequence of this uniqueness for hyperbolic forms is that each of the two tangent
lines contains only one rational point, the intersection point of the two lines. This is
because any other rational point would correspond to another form having one of its
tangent lines the same as for ax^2 + bxy + cy^2 and the other tangent line different,
contradicting the previous observation that each tangent line for a hyperbolic form
determines the other. (The hypothetical second form would also be hyperbolic since
the common tangency point for the two forms is not a rational point on E.)

The points in the plane a +c = 1 that correspond to 0-hyperbolic forms are 
dense in the region of this plane outside E since for an arbitrary point in this region 
we can first take the two tangent lines to E through this point and then take a pair 
of nearby lines that are tangent at rational points of E since points in E with rational 
labels are dense in E. It is also true that points in the plane a +c = 1 that correspond 
to hyperbolic forms are dense in the region outside E. To see this we can proceed 
in two steps. First consider the case of a point in this region whose two tangent 
lines to E are tangent at irrational points of E. These two irrational points are the 
endpoints of an infinite strip in the Farey diagram that need not be periodic. However 
we can approximate this strip by a periodic strip by taking a long finite segment of 
the infinite strip and then repeating this periodically at each end. This means that the 
given point in the region outside E lies arbitrarily close to points corresponding to 
hyperbolic forms. Finally, a completely arbitrary point in the region outside E can be 
approximated by points whose tangent lines to E touch E at irrational points since 
irrational numbers are dense in real numbers. 

It remains to consider hyperbolic and 0-hyperbolic forms (px + qy)(rx + sy)
corresponding to points (a,b,c) in the plane a + c = 0. Such a form determines
a line through the origin in this plane, and the tangent planes to the double cone
that intersect in this line intersect the plane a + c = 1 in two parallel lines tangent
to E at two diametrically opposite points p/q and -q/p. This means that the form is
(px + qy)(qx - py), up to a constant multiple. If p/q is rational this is a 0-hyperbolic
form. Examples are:


• xy with vertical tangents to E at 1/0 and 0/1.

• x^2 - y^2 = (x + y)(x - y) with horizontal tangents to E at 1/1 and -1/1.

• 2x^2 - 3xy - 2y^2 = (2x + y)(x - 2y) with parallel tangents at 2/1 and -1/2.

If p/q and -q/p are conjugate quadratic irrationals then we have a hyperbolic form
ax^2 + bxy + cy^2 = a(x - αy)(x - ᾱy) where αᾱ = -1 since c = -a when a + c = 0.
Thus α and ᾱ are negative reciprocals of each other that are interchanged by 180
degree rotation of E. As examples we have:

    x^2 + xy - y^2 = (x - ((-1 + √5)/2)y)(x - ((-1 - √5)/2)y)
    2x^2 + xy - 2y^2 = 2(x - ((-1 + √17)/4)y)(x - ((-1 - √17)/4)y)

One can consider a pair of parallel tangent lines to E as the limit of a pair of inter- 
secting tangents where the point of intersection moves farther and farther away from 
E in a certain direction which becomes the direction of the pair of parallel tangents.

With the various things we have learned about quadratic forms so far, let us
return to the basic representation problem of determining what values a given form
Q(x, y) = ax^2 + bxy + cy^2 can take on when x and y are integers, or in other
words, which numbers can be represented as ax^2 + bxy + cy^2 for some choice of
integers x and y. Remember that it suffices to restrict attention to the values of Q 
appearing in the topograph since these are the values for primitive pairs (x,y), and 
to get all other values one just multiplies the values in the topograph by arbitrary 
squares. With this in mind we will adopt the following convention in the rest of the 
book: 


When we say that a form Q represents a number n we mean that n = Q(x, y) 
for some primitive pair of integers (x,y) ≠ (0,0).


This differs from the traditional terminology in which any solution of n = Q(x, y) is 
called a representation of n, without requiring (x, y) to be a primitive pair, and when 
(x,y) is primitive it is called a proper or primitive representation of n. However, 
since we will rarely consider the case that (x,y) is not a primitive pair, it will save 
many words not to have to insert the extra modifier for every representation. 

We will focus on forms that are either elliptic or hyperbolic, as these are the most 
interesting cases.


6.1 Three Levels of Complexity 


In this section we will look at a series of examples to try to narrow down what sort 
of answer one could hope to obtain for the representation problem. The end result 
will be a reasonable guess that will be verified in the rest of this chapter and the next 
one, at least for fundamental discriminants. For nonfundamental discriminants there 
is sometimes a small extra wrinkle that seems to be rather subtle and more difficult 
to analyze. 

As a first example let us try to find a general pattern in the values of the form
x^2 + y^2. In view of the symmetry of the topograph for this form it suffices to look just
in the first quadrant of the topograph. Part of this quadrant is shown in the figure
below, somewhat distorted to fit more numbers into the picture. What is shown is all
the numbers in the topograph that are less than 100.


[Figure: part of the first quadrant of the topograph of Q(x, y) = x^2 + y^2.]


At first glance it may be hard to detect any patterns here. Both even and odd numbers 
occur, but none of the even numbers are divisible by 4 so they are all twice an odd 
number, and in fact an odd number that appears in the topograph. Considering the 
odd numbers, one notices they are all congruent to 1 mod 4 and not 3 mod 4, which 
is the other possibility for odd numbers. On the other hand, not all odd numbers 
congruent to 1 mod 4 appear in the topograph. Up to 100, the ones that are missing 
are 9, 21, 33, 45, 49, 57, 69, 77, 81, and 93. Each of these has at least one prime 
factor congruent to 3 mod 4, while all the odd numbers that do appear have all their 
prime factors congruent to 1 mod 4. Conversely, all products of primes congruent 
to 1 mod 4 are in the topograph. 
This leads us to guess that the following might be true: 


Conjecture. The numbers that appear in the topograph of x^2 + y^2 are precisely
the numbers n = 2^a p_1 p_2 ··· p_k where a ≤ 1 and each p_i is a prime congruent to
1 mod 4. Consequently, the values of the quadratic form Q(x, y) = x^2 + y^2 as x
and y range over all integers (not just the primitive pairs) are exactly the numbers
n = m^2 p_1 p_2 ··· p_k where m is an arbitrary integer and each p_i is either 2 or a
prime congruent to 1 mod 4.


In both statements the index k denoting the number of prime factors p; is allowed 
to be zero as well as any positive integer. The restriction a ≤ 1 in the first statement
disappears in the second statement since higher powers of 2 can occur when we 
multiply by arbitrary squares. We will prove the conjecture later in the chapter. 
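The first statement of the conjecture is easy to test by brute force in the range covered by the figure. The following Python sketch lists the values of x^2 + y^2 at primitive pairs below 100 and compares them with the numbers predicted by the conjecture. It is only a finite check under our own choice of function names, not a proof.

```python
from math import gcd, isqrt


def primitive_values(limit):
    """Values 0 < x^2 + y^2 < limit taken at primitive pairs (x, y)."""
    values = set()
    bound = isqrt(limit)
    for x in range(bound + 1):
        for y in range(bound + 1):
            n = x*x + y*y
            if 0 < n < limit and gcd(x, y) == 1:
                values.add(n)
    return values


def prime_factors(n):
    """Prime factors of n listed with multiplicity, by trial division."""
    factors = []
    d = 2
    while d*d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def predicted(n):
    """True if n = 2^a p_1 ... p_k with a <= 1 and every p_i congruent to 1 mod 4."""
    factors = prime_factors(n)
    return factors.count(2) <= 1 and all(p == 2 or p % 4 == 1 for p in factors)


values = primitive_values(100)
missing = [n for n in range(1, 100, 4) if n not in values]
print(sorted(values))
print(missing)
```

The list `missing` collects the numbers congruent to 1 mod 4 below 100 that do not appear, which can be compared with the list given in the text.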

A weaker form of the conjecture can be proved just by considering congruences 
mod 4 as follows. An even number squared is congruent to 0 mod 4 and an odd 
number squared is congruent to 1 mod 4, so x^2 + y^2 must be congruent to 0, 1, or
2 mod 4. Moreover, the only way that x^2 + y^2 can be 0 mod 4 is for both x and y to
be even, which cannot happen for primitive pairs. Thus all numbers in the topograph 
must be congruent to 1 or 2 mod 4. This says that the odd numbers in the topograph 
are congruent to 1 mod 4 and the even numbers are each twice an odd number. 

However, these simple observations say nothing about the role played by primes 
and prime factorizations, nor do they include any positive assertions about which
numbers actually are represented by x^2 + y^2. It definitely takes more work to show
for example that every prime p = 4k + 1 can be represented as the sum of two squares.


Let us look at a second example to see whether the same sorts of patterns occur, 
this time for the form Q(x, y) = x^2 + 2y^2. Here is a portion of its topograph showing
all values less than 100, with the lower half of the topograph omitted since it is just 
the mirror image of the upper half: 


[Figure: upper half of the topograph of Q(x, y) = x^2 + 2y^2.]


Again the even values are just the doubles of the odd values. The odd prime values are
3, 11, 17, 19, 41, 43, 59, 67, 73, 83, 89, 97 and the other odd values are all the products
of these primes. The odd prime values are not determined by their values mod 4
in this case, but instead by their values mod 8 since the primes we just listed are
exactly the primes less than 100 that are congruent to 1 or 3 mod 8. Apart from
this change, the answer to the representation problem for x^2 + 2y^2 is completely
analogous to the answer for x^2 + y^2. Namely, the numbers represented by x^2 + 2y^2
are the numbers n = 2^a p_1 p_2 ··· p_k with a ≤ 1 and each p_i a prime congruent to 1
or 3 mod 8. Using congruences mod 8 we could easily prove the weaker statement
that all numbers represented by x^2 + 2y^2 must be congruent to 1, 2, 3, or 6 mod 8,
so all odd numbers in the topograph must be congruent to 1 or 3 mod 8 and all even
numbers must be twice an odd number.


These two examples were elliptic forms, but the same sort of behavior can occur 
for hyperbolic forms as we see in the next example, the form x^2 - 2y^2. The negative
values of this form happen to be just the negatives of the positive values, so we need 
only show the positive values in the topograph: 


[Figure: positive values in the topograph of Q(x, y) = x^2 - 2y^2.]

Here the primes that occur are 2 and primes congruent to ±1 mod 8. The nonprime
values that occur are the products of primes congruent to ±1 mod 8 and twice these
products. Again there is a weaker statement that can be proved using just congruences 
mod 8. 
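The weaker congruence statements for the three examples so far can all be obtained by the same finite computation: let x and y run over residues modulo 4 or 8, discard the pairs with x and y both even (these cannot come from primitive pairs), and record the residues of the form. A minimal Python sketch, with a function name of our own choosing:

```python
def primitive_residues(a, b, c, m):
    """Residues mod m of ax^2 + bxy + cy^2 over pairs (x, y) not both even.

    The modulus m is assumed to be even so that parity is determined by
    the residue class.  Pairs that are not both even include all primitive
    pairs, so the result is a necessary condition on topograph values.
    """
    return sorted({(a*x*x + b*x*y + c*y*y) % m
                   for x in range(m) for y in range(m)
                   if x % 2 == 1 or y % 2 == 1})


print(primitive_residues(1, 0, 1, 4))    # x^2 + y^2 mod 4
print(primitive_residues(1, 0, 2, 8))    # x^2 + 2y^2 mod 8
print(primitive_residues(1, 0, -2, 8))   # x^2 - 2y^2 mod 8
```

For x^2 - 2y^2 the odd residues that occur are 1 and 7, in line with the primes congruent to ±1 mod 8 observed in the topograph.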


In these three examples the guiding principle was to look at prime factorizations 
and at primes modulo certain numbers, the numbers 4, 8, and 8 in the three cases. 
Notice that these numbers are just the absolute values of the discriminants -4, -8,
and 8. Looking at primes mod |Δ| turns out to be a key idea for all quadratic forms.

Another example of the same sort is the form x^2 + xy + y^2 of discriminant -3.
This time it is the prime 3 that plays a special role rather than 2. 


[Figure: one-sixth of the topograph of Q(x, y) = x^2 + xy + y^2.]


We only have to draw one-sixth of the topograph because of all the symmetries. Notice 
that all the values are odd, so the prime 2 plays no role here. Since the discriminant 
is -3 we are led to consider congruences mod 3. The primes in the topograph are
3 and the primes congruent to 1 mod 3 (which in particular excludes the prime 2),
namely the primes 7, 13, 19, 31, 37, 43, 61, 67, 73, 79, 97. The nonprime values are the
products of these primes with the restriction that the prime 3 never has an exponent 
greater than 1. This is analogous to the prime 2 never having an exponent greater 
than 1 in the preceding examples. In all four examples the “special” primes whose 
exponents are restricted are just the prime divisors of the discriminant. This is a 
general phenomenon, that primes dividing the discriminant behave differently from 
primes that do not divide the discriminant. 

A special feature of the discriminants -4, -8, 8, and -3 is that in each case all
forms of that discriminant are equivalent. We will see that the representation problem 
always has the same type of answer for discriminants with a single equivalence class 
of forms. 


Before going on to the next level of complexity let us digress to describe a nice
property that forms of the first level of complexity have. As we know, if an equation
Q(x, y) = n has an integer solution (x, y) then so does Q(x, y) = m^2 n for
every integer m. The converse is not always true however. For example the equation
2x^2 + 7y^2 = 9 has the solution (x, y) = (1, 1) but 2x^2 + 7y^2 = 1 obviously has no
solution with x and y integers. Nevertheless, this converse property does hold for
forms such as those in the preceding four examples where the numbers n for which
Q(x, y) = n has an integer solution are exactly the numbers that can be factored as
n = m^2 p_1 p_2 ··· p_k for primes p_i satisfying certain conditions and m an arbitrary
integer. This is because if a number n has a factorization of this type then we can
cancel any square factor of n and the result still has a factorization of the same type.

Let us apply this “square-cancellation” property in the case of the form x^2 + y^2 to
determine the numbers n such that the circle x^2 + y^2 = n contains a rational point,
and hence, as in Chapter 0, an infinite dense set of rational points. Suppose first that
the circle x^2 + y^2 = n contains a rational point, so after putting the two coordinates
over a common denominator the point is (x, y) = (a/c, b/c). The equation x^2 + y^2 = n
then becomes a^2 + b^2 = c^2 n. This means that the equation x^2 + y^2 = c^2 n has
an integer solution. Then the square-cancellation property implies that the original
equation x^2 + y^2 = n has an integer solution. Thus we see that if there are rational
points on the circle x^2 + y^2 = n then there are integer points on it. This is not
something that is true for all quadratic curves, as shown by the example of the ellipse
2x^2 + 7y^2 = 1 which has rational points such as (1/3, 1/3) but no integer points.

From the solution to the representation problem for x^2 + y^2 we deduce that the
circle x^2 + y^2 = n contains rational points exactly when n = m^2 p_1 p_2 ··· p_k where
m is an arbitrary integer and each p_i is either 2 or a prime congruent to 1 mod 4.
The first few values of n satisfying this condition are 1, 2, 4, 5, 8, 9, 10, 13, 16, 17,
18, 20, ···
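Since rational points on the circle x^2 + y^2 = n exist exactly when integer points do, the list of the first few values of n can be reproduced by searching for integer points directly. The Python sketch below does this for n up to 20 and also verifies the rational point on the ellipse 2x^2 + 7y^2 = 1 mentioned above. The function name is a hypothetical one used only here.

```python
from fractions import Fraction
from math import isqrt


def has_integer_point(n):
    """True if x^2 + y^2 = n has a solution in integers x and y."""
    for x in range(isqrt(n) + 1):
        rest = n - x*x
        y = isqrt(rest)
        if y*y == rest:
            return True
    return False


first_values = [n for n in range(1, 21) if has_integer_point(n)]
print(first_values)

# The point (1/3, 1/3) lies on the ellipse 2x^2 + 7y^2 = 1.
third = Fraction(1, 3)
on_ellipse = 2*third**2 + 7*third**2 == 1
print(on_ellipse)
```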


Now let us look at some examples with a second level of complexity. First consider
the case of discriminant 40 where the class number is 2 and two nonequivalent forms
are x^2 - 10y^2 and 2x^2 - 5y^2. The topographs below show the positive values less
than 100. The topographs are periodic and also have mirror symmetry so it suffices 
to show half of one period. There is no need to show any more of the negative values 
since these will just be the negatives of the positive values. 


[Figures: topographs of Q_1(x, y) = x^2 - 10y^2 and Q_2(x, y) = 2x^2 - 5y^2.]

For the form x^2 - 10y^2 the prime values less than 100 are 31, 41, 71, 79, 89.
These are the primes congruent to ±1 or ±9 mod 40, the discriminant. However, in
contrast to what happened in the previous examples, there are many nonprime values
of this form that are not products of these prime values. The prime factors of these
nonprime values are 2, 3, 5, 13, 37, 43, none of which occur in the topograph of the
first form. Rather miraculously, these prime values are realized instead by the second
form 2x^2 - 5y^2. The prime values this form takes on are 2 and 5, which are the
prime divisors of the discriminant 40, along with primes congruent to ±3 and ±13
mod 40, namely 3, 13, 37, 43, 53, 67, and 83.

Apart from the primes 2 and 5 that divide the discriminant, the possible values
of primes mod 40 are ±1, ±3, ±7, ±9, ±11, ±13, ±17, ±19 since even numbers and
multiples of 5 are excluded. There are sixteen different congruence classes here,
and exactly half of them, eight, are realized by one or the other of the two forms
x^2 - 10y^2 and 2x^2 - 5y^2, with four classes realized by each form. The other eight
congruence classes are not realized by any form of discriminant 40 since every form
of discriminant 40 is equivalent to one of the two forms x^2 - 10y^2 or 2x^2 - 5y^2, as
is easily checked by the methods from the previous chapter.
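The prime values quoted for the two forms of discriminant 40 can be recovered by a direct search. The Python sketch below is a brute-force check under two assumptions of ours: the search window 0 ≤ x, y ≤ 40 is taken to be large enough to reach every positive prime value below 100 (the largest solution it needs for the listed primes is 27^2 - 10·8^2 = 89), and only nonnegative x and y are tried since both forms have no xy term.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d*d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def positive_prime_values(a, c, limit=100, search=40):
    """Primes p < limit of the form ax^2 + cy^2 with 0 <= x, y <= search."""
    primes = set()
    for x in range(search + 1):
        for y in range(search + 1):
            n = a*x*x + c*y*y
            if 0 < n < limit and is_prime(n):
                primes.add(n)
    return sorted(primes)


primes_q1 = positive_prime_values(1, -10)   # x^2 - 10y^2
primes_q2 = positive_prime_values(2, -5)    # 2x^2 - 5y^2
classes_q1 = sorted({p % 40 for p in primes_q1})
classes_q2 = sorted({p % 40 for p in primes_q2 if p not in (2, 5)})

print(primes_q1, classes_q1)
print(primes_q2, classes_q2)
```

The residues 31 and 39 are the classes of -9 and -1 mod 40, and 27 and 37 are the classes of -13 and -3, so each form realizes four of the sixteen classes in this range.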

This turns out to be a general phenomenon valid for all elliptic and hyperbolic 
forms: If one excludes the primes that divide the discriminant, then the prime values 
of quadratic forms of that discriminant are exactly the primes in half of the congruence 
classes modulo the discriminant of numbers coprime to the discriminant. This will 
be proved in Proposition 6.22. Also, each form represents primes in the same number 
of congruence classes. For Δ = 40 this is four congruence classes for each form.

The primes 2 and 5 that divide the discriminant occur in the topographs only to
the first power, and no numbers in the topographs are divisible by 2^2 or 5^2. This
agrees with what happened in the earlier examples. Apart from this restriction it
appears that each product of primes represented by Q_1 or Q_2 is also represented
by Q_1 or Q_2. The problem is to decide which form represents which products. For
numbers in the topographs not divisible by 2 or 5 it seems that these numbers are
subject to the same congruence conditions as for primes, so they are congruent to ±1
or ±9 for Q_1 and to ±3 or ±13 for Q_2.

If one includes numbers divisible by 2 or 5 the following statements seem to be
true, provided that numbers divisible by 2^2 or 5^2 are excluded:

• The product of two numbers represented by Q_1 is again represented by Q_1.

• The product of two numbers represented by Q_2 is represented by Q_1.

• The product of a number represented by Q_1 with a number represented by Q_2
is represented by Q_2.

To illustrate the first statement, the numbers 6, 9, and 10 appear in the topograph
of Q_1 hence so do 6·9, 9·9, and 9·10, but not 6·10 since this is divisible by 2^2.
For the second statement, the numbers 2, 3, and 5 are in the topograph of Q_2 so
2·3, 3·3, 2·5, and 3·5 are in the topograph of Q_1 but not 2·2 or 5·5. The product
2·3·5 is then in the topograph of Q_2 by the third statement.

An abbreviated way of writing the three rules is by the formulas Q_1Q_1 = Q_1,
Q_2Q_2 = Q_1, and Q_1Q_2 = Q_2. One can see that these are formally the same as the
rules for addition of integers mod 2: 0 + 0 = 0, 1 + 1 = 0, and 0 + 1 = 1. The two
formulas Q_1Q_1 = Q_1 and Q_1Q_2 = Q_2 say that Q_1 serves as an identity element “1”
for this multiplication operation, and then the formula Q_2Q_2 = Q_1 can be interpreted
as saying that Q_2 is equal to its own inverse, so Q_2 = Q_2^{-1}.
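The three product rules can be tested against the topograph values below 100. The Python sketch below takes Q_1 = x^2 - 10y^2 and Q_2 = 2x^2 - 5y^2, collects their positive values at primitive pairs in a search window 0 ≤ x, y ≤ 40 (an assumption of ours about how far one must search to see every value below 100), and checks each product that is below 100 and not divisible by 2^2 or 5^2. This is a finite experiment, not a proof, and the helper names are hypothetical.

```python
from math import gcd


def primitive_values(a, c, limit=100, search=40):
    """Positive values below limit of ax^2 + cy^2 at primitive pairs (x, y)."""
    return {a*x*x + c*y*y
            for x in range(search + 1) for y in range(search + 1)
            if gcd(x, y) == 1 and 0 < a*x*x + c*y*y < limit}


V = {1: primitive_values(1, -10), 2: primitive_values(2, -5)}


def index_of(n):
    """The index i with n in the topograph of Q_i, or None if not exactly one."""
    hits = [i for i in (1, 2) if n in V[i]]
    return hits[0] if len(hits) == 1 else None


def rules_hold(limit=100):
    """Check Q_1 Q_1 = Q_1, Q_2 Q_2 = Q_1, Q_1 Q_2 = Q_2 on products below limit."""
    for i in (1, 2):
        for j in (1, 2):
            expected = 1 if i == j else 2
            for m in V[i]:
                for n in V[j]:
                    product = m * n
                    if product >= limit or product % 4 == 0 or product % 25 == 0:
                        continue  # outside the range, or excluded by the proviso
                    if index_of(product) != expected:
                        return False
    return True


print(rules_hold())
print(index_of(54), index_of(30), index_of(60))
```

The last line looks up 6·9 = 54, 2·3·5 = 30, and 6·10 = 60, the examples discussed above.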

This way of “multiplying” forms is more than just shorthand notation, and in 
Chapter 7 we will develop a general method for forming products of primitive forms 
of a fixed discriminant that will be a key ingredient in reducing the representation 
problem to the special case of representing primes. 


The various observations we have made so far about the two forms of discriminant 
40 lead to the following: 


Conjecture. The positive numbers represented by either Q_1 or Q_2 are exactly the
products 2^a 5^b p_1 p_2 ··· p_k where a, b ≤ 1 and each p_i is a prime congruent to ±1,
±3, ±9, or ±13 mod 40. The form Q_1 represents the primes p_i ≡ ±1 and ±9
while Q_2 represents 2, 5, and the primes p_i ≡ ±3 and ±13. One can determine
which form will represent a product 2^a 5^b p_1 p_2 ··· p_k by the rule that if the number
of terms in the product that are represented by Q_2 is even then the product is
represented by Q_1 and if it is odd then the product is represented by Q_2.


For example, the topograph of Q_1 contains the even powers of 3 while the topograph
of Q_2 contains the odd powers. Another consequence is that the even values
in one topograph are just the doubles of the odd values in the other topograph. 

This characterization of numbers represented by these two forms also implies 
that no number is represented by both Q_1 and Q_2. However, for some discriminants
it is possible for two nonequivalent forms of that discriminant to represent the same 
nonzero number, as we will see. 

The Conjecture will be proved piece by piece as we gradually develop the neces- 
sary general theory. The first statement will be an application of Theorem 6.8 together 
with later facts in Section 6.2. The second statement will be an application of Propo- 
sition 6.19 and the rest of the Conjecture will use results from Chapter 7, particularly 
Theorem 7.7.

Let us look at another example where the representation problem has an answer
that is qualitatively similar to the preceding example but just a little more complicated,
the case of discriminant -84. Here there are twice as many equivalence classes of
forms, four instead of two, with topographs shown below.


[Figures: topographs of the four forms
Q_1(x, y) = x^2 + 21y^2, Q_2(x, y) = 3x^2 + 7y^2,
Q_3(x, y) = 2x^2 + 2xy + 11y^2, Q_4(x, y) = 5x^2 + 4xy + 5y^2.]


The primes dividing the discriminant -84 are 2, 3, and 7, and these primes are each
represented by one of the forms. In fact the divisors of the discriminant that appear
in the topographs are 1, 2, 3, 6, 7, 14, 21, and 42 which are precisely the squarefree
divisors of the discriminant, where a number is called squarefree if it is not divisible
by any square greater than 1. These squarefree divisors of Δ are exactly the numbers
appearing on reflector lines of mirror symmetries of the topographs. This was the 
case also in the previous examples, as one can check, and is a general phenomenon 
for fundamental discriminants as a consequence of Corollary 5.7, Theorem 6.8, and 
Proposition 6.17. 

For the primes not dividing the discriminant, we will show in Section 6.3 that the
primes represented by each form are as follows:

• For Q_1 the primes p ≡ 1, 25, 37 mod 84.
• For Q_2 the primes p ≡ 19, 31, 55 mod 84.
• For Q_3 the primes p ≡ 11, 23, 71 mod 84.
• For Q_4 the primes p ≡ 5, 17, 41 mod 84.

This agrees with what is shown in the four topographs above, and one could expand
the topographs to get further evidence that these are the right answers. Passing from
primes to arbitrary numbers appearing in at least one of the topographs, these appear
to be exactly the products 2^a 3^b 7^c p_1 ⋯ p_k with a, b, c ≤ 1 and each p_i one of the
other primes represented by Q_1, Q_2, Q_3, or Q_4.

One can work out hypothetical rules for multiplying the forms by considering
how products of two primes are represented. For example, 3 is represented by Q_2
and 11 is represented by Q_3, while their product 3·11 = 33 is represented by Q_4, so
we might guess that Q_2 Q_3 = Q_4. Some other products that give the same conclusion
are 3·2 = 6, 3·23 = 69, 7·2 = 14, 7·11 = 77, and 31·2 = 62. In the same way one
can determine tentative rules for all the products Q_i Q_j, with the following results:


• The principal form Q_1 acts as the identity, so Q_1 Q_i = Q_i for each i.
• Q_i Q_i = Q_1 for each i so each Q_i equals its own inverse.
• The product of any two out of Q_2, Q_3, Q_4 is equal to the third.
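The tentative rules above can be tested numerically. The following is a minimal sketch, assuming the four forms of discriminant −84 are the ones named in the text; the helper names (`represents`, `forms_representing`, `product_table`) are hypothetical and the search is plain brute force over coprime pairs (x, y).

```python
from math import gcd, isqrt

# Forms of discriminant -84 as coefficient triples (a, b, c) of ax^2 + bxy + cy^2.
FORMS = {1: (1, 0, 21), 2: (3, 0, 7), 3: (2, 2, 11), 4: (5, 4, 5)}

def represents(form, n):
    """True if the positive definite form takes the value n at a coprime pair (x, y)."""
    a, b, c = form
    d = 4 * a * c - b * b                  # |discriminant|, positive here
    # 4a*Q = (2ax + by)^2 + d*y^2 forces y^2 <= 4a*n/d, and likewise x^2 <= 4c*n/d.
    xmax = isqrt(4 * c * n // d) + 1
    ymax = isqrt(4 * a * n // d) + 1
    return any(a * x * x + b * x * y + c * y * y == n and gcd(x, y) == 1
               for x in range(-xmax, xmax + 1)
               for y in range(-ymax, ymax + 1))

def forms_representing(n):
    """Indices i such that n appears in the topograph of Q_i."""
    return {i for i, form in FORMS.items() if represents(form, n)}

def product_table(primes):
    """Map (i, j) to the forms representing p*q, for distinct primes p in Q_i, q in Q_j."""
    table = {}
    for p in primes:
        for q in primes:
            if p != q:
                (i,) = forms_representing(p)   # each listed prime lies in one topograph
                (j,) = forms_representing(q)
                table.setdefault((i, j), set()).update(forms_representing(p * q))
    return table

TABLE = product_table([2, 3, 5, 7, 11, 17, 19, 23, 37])
```

For this sample of primes every entry of `TABLE` is a single form, and the entries agree with the three rules, for instance `TABLE[(2, 3)]` is `{4}`. This is evidence for the rules, not a proof of them.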


These multiplication rules are formally identical to how one would add pairs (m,n) of
integers mod 2 by adding their two coordinates separately. The form Q_1 corresponds
to the pair (0,0) and the first of the three rules above becomes the formula (0,0) +
(m,n) = (m,n). The forms Q_2, Q_3, and Q_4 correspond to (1,0), (0,1), and (1,1)
in any order, and then the second rule above becomes (m,n) + (m,n) = (0,0) which
is valid for addition mod 2, while the third rule becomes the fact that the sum of any
two of (1,0), (0,1), and (1,1) is equal to the third if we do addition mod 2.

The multiplication rules determine which form represents a given number n by
replacing each prime in the prime factorization of n by the form Q_i that represents
it, then multiplying out the resulting product using the three multiplication rules,
keeping in mind that 2, 3, and 7 can never occur with an exponent greater than 1.
For example, for n = 70 = 2·5·7 we get the product Q_3 Q_4 Q_2 which equals Q_1
and so 70 is represented by Q_1, as the topograph shows. For n = 66 = 2·3·11
we get Q_3 Q_2 Q_3 = Q_2 and 66 is represented by Q_2. In general, for a number
n = 2^a 3^b 7^c p_1 ⋯ p_k we can determine which form represents n by the follow-
ing steps. First compute the number q_i of prime factors of n represented by Q_i.
Next compute the sum q_1(0,0) + q_2(1,0) + q_3(0,1) + q_4(1,1) = (q_2 + q_4, q_3 + q_4)
where (0,0), (1,0), (0,1), (1,1) correspond to Q_1, Q_2, Q_3, Q_4 respectively. The re-
sulting sum (r,s), reduced mod 2, then tells which form represents n.
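The steps just described can be sketched in code. This is an illustration only, assuming the congruence classes mod 84 stated earlier for each form and the conjectural multiplication rules; the function name `form_representing` and the lookup tables are hypothetical.

```python
# Pairs mod 2 attached to the forms Q_1, ..., Q_4 of discriminant -84.
PAIR = {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (1, 1)}
FORM_OF_PAIR = {pair: i for i, pair in PAIR.items()}

# Congruence classes mod 84 of the primes represented by each form (from the text).
CLASS_TO_FORM = {r: i
                 for i, rs in {1: (1, 25, 37), 2: (19, 31, 55),
                               3: (11, 23, 71), 4: (5, 17, 41)}.items()
                 for r in rs}

# Primes dividing 84: 2 is represented by Q_3, while 3 and 7 are represented by Q_2.
RAMIFIED = {2: 3, 3: 2, 7: 2}

def form_representing(n):
    """Index i of the form Q_i predicted to represent n, or None if none does."""
    total = (0, 0)
    p = 2
    while n > 1:
        if n % p:
            p += 1
            continue
        n //= p                       # p is a prime factor of n
        if p in RAMIFIED:
            if n % p == 0:            # 2, 3, 7 may occur only to the first power
                return None
            i = RAMIFIED[p]
        else:
            i = CLASS_TO_FORM.get(p % 84)
            if i is None:             # p is not represented by any of the forms
                return None
        total = ((total[0] + PAIR[i][0]) % 2, (total[1] + PAIR[i][1]) % 2)
    return FORM_OF_PAIR[total]
```

For the two worked examples, `form_representing(70)` gives 1 and `form_representing(66)` gives 2, in agreement with the topographs.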


An interesting feature of all the forms at the first or second level of complexity 
that we have examined so far is that their topographs have mirror symmetry. This is 
actually a general phenomenon: Whenever all the forms of a given discriminant have 
mirror symmetry, then one can determine which primes are represented by each form 
just in terms of congruence conditions modulo the discriminant. And in fact this is 
the only time when congruences modulo the discriminant determine how primes are 
represented, at least if one restricts attention just to primitive forms. This will be 
shown in Corollary 6.28. In Chapter 5 we called discriminants for which all primitive 
forms have mirror symmetry fully symmetric discriminants, and we observed that 
they are unfortunately rather rare, with only 101 negative discriminants known to 
have this property, and probably no more.

Now we move on to the third level of complexity, illustrated by the case Δ = −56
where there are three equivalence classes of forms, with topographs shown below.
The first two topographs have mirror symmetry but the third topograph does not, so
the third form counts twice when determining the class number for discriminant −56,
which is therefore 4 rather than 3.


[Figure: topographs of Q_1(x, y) = x^2 + 14y^2, Q_2(x, y) = 2x^2 + 7y^2, and
Q_3(x, y) = 3x^2 + 2xy + 5y^2.]


The behavior of divisors of the discriminant is the same as in the previous examples. 
Only the squarefree divisors appear, 1, 2, 7, and 14, and these are the numbers 
appearing on the reflector lines. 

In the examples at the first two levels of complexity it was possible to determine
which numbers are represented by a given form by looking at primes and which con-
gruence classes they fall into modulo the discriminant. The primes represented by a
given form were exactly the primes in certain congruence classes modulo the discrim-
inant. This is no longer true for discriminant −56 however. For example the primes
23 and 79 are congruent mod 56, and yet 23 is represented by Q_1 = x^2 + 14y^2 since
Q_1(3,1) = 23, while 79 is represented by Q_2 = 2x^2 + 7y^2 since Q_2(6,1) = 79.

Another nice property that held in the previous examples was that no number
appeared in more than one topograph for the given discriminant, but this too fails
for discriminant −56 since there are many nonprimes that occur in the topographs
of both Q_1 and Q_2 starting with 15, 30, 39, 57, 65, 78, 95, 105, 114, 130, and 135.
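Both observations are easy to confirm by brute force. The sketch below, with the hypothetical helper `primitive_values`, lists the values up to 135 that each of x^2 + 14y^2 and 2x^2 + 7y^2 takes at coprime pairs (x, y), which are the values appearing in the topographs.

```python
from math import gcd, isqrt

def primitive_values(a, b, c, bound):
    """Values <= bound of the positive definite form ax^2 + bxy + cy^2 at coprime (x, y)."""
    d = 4 * a * c - b * b                  # |discriminant|
    # 4a*Q = (2ax + by)^2 + d*y^2 bounds y, and the symmetric identity bounds x.
    xmax = isqrt(4 * c * bound // d) + 1
    ymax = isqrt(4 * a * bound // d) + 1
    values = set()
    for x in range(-xmax, xmax + 1):
        for y in range(-ymax, ymax + 1):
            v = a * x * x + b * x * y + c * y * y
            if 0 < v <= bound and gcd(x, y) == 1:
                values.add(v)
    return values

Q1 = primitive_values(1, 0, 14, 135)       # x^2 + 14y^2
Q2 = primitive_values(2, 0, 7, 135)        # 2x^2 + 7y^2
COMMON = sorted(Q1 & Q2)                   # numbers in both topographs
```

Here `COMMON` reproduces the list 15, 30, 39, ..., 135 given above, while 23 lies only in `Q1` and 79 only in `Q2` even though 79 − 23 = 56.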

Apart from the primes 2 and 7 that divide the discriminant −56, all other primes
belong to the following 24 congruence classes mod 56, corresponding to odd numbers
less than 56 not divisible by 7:

 1  3  5  9 11 13 15 17 19 23 25 27
29 31 33 37 39 41 43 45 47 51 53 55

The six congruence classes whose prime elements are represented by Q_1 or Q_2 are
1, 9, 15, 23, 25, and 39, and the six congruence classes whose prime elements are
represented by Q_3 are 3, 5, 13, 19, 27, and 45. Primes not represented by any of the
three forms are in the remaining twelve congruence classes.

The new thing that happens in this example is that one cannot tell whether a 
prime is represented by Q_1 or Q_2 just by considering congruence classes mod the
discriminant. We saw this for the pair of primes 23 and 79, and another such pair 
visible in the topographs is 71 and 127. By extending the topographs we could find 
many more such pairs. One might try using congruences modulo some other number 
besides 56, but it is known that this does not help. 

Congruences mod 56 suffice to tell which primes are represented by Q_3, but
there is a different sort of novel behavior involving Q_3 when we look at representing
products of primes. To illustrate this, observe that the primes 3 and 5 are represented
by Q_3 but their product 15 is represented by both Q_1 and Q_2. This means there is
some ambiguity about whether the product Q_3 Q_3 should be Q_1 or Q_2. The same
thing happens in fact for any pair of coprime numbers represented by Q_3, for example
5 and 6 whose product 30 is represented by both Q_1 and Q_2.

For other products Q_i Q_j there seems to be no ambiguity. The principal form Q_1
acts as the identity for multiplication, while Q_2 Q_2 = Q_1 and Q_2 Q_3 = Q_3, although
this last formula is somewhat odd since it seems to imply that Q_3 does not have a
multiplicative inverse since if it did, we could multiply the equation Q_2 Q_3 = Q_3 by
this inverse to get Q_2 = Q_1.

There is a way out of these difficulties, discovered by Gauss. The troublesome
form Q_3 is different from the other forms in this example and in the preceding
examples in that it does not have mirror symmetry. Thus the equivalence class of
Q_3 splits into two proper equivalence classes, with Q_3 having a mirror image form
Q_4 = 3x^2 − 2xy + 5y^2 obtained from Q_3 by changing the sign of either x or y and
hence changing the coefficient of xy to its negative. Using Q_4 we can then resolve the
ambiguous product Q_3 Q_3 by setting Q_3 Q_3 = Q_2 = Q_4 Q_4 and Q_3 Q_4 = Q_1 so that Q_4
is the inverse of Q_3. This means that each Q_i has its inverse given by the mirror im-
age topograph since Q_1 and Q_2 have mirror symmetry and equal their own inverses.
The rigorous justification for the formulas Q_3 Q_3 = Q_2 = Q_4 Q_4 and Q_3 Q_4 = Q_1 will
come in Chapter 7, but for the moment one can check that these formulas are at least
consistent with the topographs.

Since Q_3^2 = Q_2 we have Q_3^4 = Q_2^2 = Q_1. Multiplying the equation Q_3^4 = Q_1 by
Q_4, the inverse of Q_3, gives Q_3^3 = Q_4. Thus all four proper equivalence classes of
forms are powers of the single form Q_3 since Q_3^2 = Q_2, Q_3^3 = Q_4, and Q_3^4 = Q_1. This
is corroborated by the representations of powers of 3 since 3 is represented by Q_3,
3^2 by Q_3^2 = Q_2, 3^3 by Q_3^3 = Q_4, and 3^4 by Q_3^4 = Q_1. Products of powers Q_3^i are
computed by adding exponents mod 4 since Q_3^4 is the identity. Thus multiplication
of the four forms is formally identical with addition of integers mod 4. The earlier
doubtful formula Q_2 Q_3 = Q_3 is resolved into the two formulas Q_2 Q_3 = Q_4 and
Q_2 Q_4 = Q_3, which become Q_3^2 Q_3 = Q_3^3 and Q_3^2 Q_3^3 = Q_3^5 = Q_3.

The appearance of the same number in two different topographs is easy to explain
now that we have two forms Q_3 and Q_4 representing exactly the same numbers. For
example, to find all appearances of the number 15 = 3·5 in the topographs we observe
that its prime factors 3 and 5 appear in the topographs of both Q_3 and Q_4 so 15
will appear in the topographs of Q_3 Q_3 = Q_2, Q_3 Q_4 = Q_1, and Q_4 Q_4 = Q_2, although
this last formula gives no new representations.

The procedure for finding which forms represent a number n = 2^a 7^b p_1 ⋯ p_k
with a, b ≤ 1 and primes p_i different from 2 or 7 is to replace each prime factor in
this product by a form Q_j that represents it, then multiply out the resulting product of
forms Q_j. There is also an extra condition that will be justified in Chapter 7: Whenever
a prime p_i appears more than once in the prime factorization of n, we should replace
all of its appearances by the same Q_j. For example, the forms representing 18 = 2·3^2
are just the products Q_2 Q_3^2 = Q_1 and Q_2 Q_4^2 = Q_1 and not Q_2 Q_3 Q_4 = Q_2, as one can
see in the topographs. Similarly, 9 = 3·3 is represented only by Q_3^2 = Q_4^2 = Q_2 and
not by Q_3 Q_4 = Q_1.
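The procedure can be compared with a direct search. In the sketch below an exponent e mod 4 stands for the class Q_3^e, so that e = 0, 1, 2, 3 correspond to Q_1, Q_3, Q_2, Q_4. The function names are hypothetical, and the form representing each prime is found by brute force rather than by any rule from the text.

```python
from math import gcd, isqrt

# Exponent e in Z/4 stands for Q_3^e: e = 0, 1, 2, 3 gives Q_1, Q_3, Q_2, Q_4.
FORMS = {0: (1, 0, 14), 1: (3, 2, 5), 2: (2, 0, 7), 3: (3, -2, 5)}

def represents(form, n):
    """True if the positive definite form takes the value n at a coprime pair (x, y)."""
    a, b, c = form
    d = 4 * a * c - b * b
    xmax = isqrt(4 * c * n // d) + 1
    ymax = isqrt(4 * a * n // d) + 1
    return any(a * x * x + b * x * y + c * y * y == n and gcd(x, y) == 1
               for x in range(-xmax, xmax + 1)
               for y in range(-ymax, ymax + 1))

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors, p = {}, 2
    while n > 1:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    return factors

def actual(n):
    """Exponents e such that n appears in the topograph of Q_3^e (brute force)."""
    return {e for e, form in FORMS.items() if represents(form, n)}

def predicted(n):
    """Exponents e predicted by the procedure in the text."""
    sums = {0}
    for p, k in factorize(n).items():
        if p in (2, 7) and k > 1:     # 2 and 7 may occur only to the first power
            return set()
        choices = actual(p)           # forms representing the prime p itself
        # All k copies of p use the same form, contributing k*e to the exponent.
        sums = {(s + k * e) % 4 for s in sums for e in choices}
    return sums
```

For example `predicted(18)` is `{0}` and `predicted(15)` is `{0, 2}`, matching the discussion of 18 and 15 above.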


We will show in Chapter 7 that the set of proper equivalence classes of primitive 
forms of fixed discriminant always has a multiplication operation compatible with 
multiplying values of forms of that discriminant in the way illustrated by the preceding 
examples. This multiplication operation gives this set the structure of a group, that 
is, a set with an associative multiplication operation for which there is an element of
the set that functions as an identity for the multiplication, and such that each element 
of the set has a multiplicative inverse in the set whose product with the given element 
is the identity element. The set of proper equivalence classes of primitive forms with 
this group structure is called the class group for the given discriminant. The identity 
element is the class of the principal form, and the inverse of a class is obtained by 
taking the mirror image topograph. 

The class group has the additional property that the multiplication is commutative.
This makes its algebraic structure much simpler than the typical noncommutative
group. An example of a noncommutative group that we have seen is the group
LF(Z) of linear fractional transformations, where the multiplication comes from mul-
tiplication of 2 × 2 matrices, or equivalently, composition of the transformations.


For a given discriminant, if the numbers represented by two primitive forms can- 
not be distinguished by congruences modulo the discriminant, then these two forms 
are said to belong to the same genus. Thus in the preceding example of discriminant 
−56 the two forms Q_1 and Q_2 are of the same genus while Q_3 is of a different genus
from Q_1 and Q_2, so there are two different genera (“genera” is the plural of “genus”).

Equivalent forms always belong to the same genus since their topographs contain 
exactly the same numbers. The first two of the three levels of complexity we have 
described correspond to the discriminants where there is only one equivalence class 
in each genus. As we stated earlier, this desirable situation is also characterized by 
the condition that all primitive forms of the given discriminant have mirror symmetry. 
For larger discriminants there can be large numbers of genera and large numbers of 
equivalence classes within a genus. However, for a fixed discriminant there are always 
the same number of proper equivalence classes within each genus, as we will show in 
Proposition 7.24. This is illustrated by the case Δ = −56 where one genus consists of
Q_1 and Q_2 and the other genus consists of Q_3 and Q_4.


The examples in this section show the significance of primes in certain congruence 
classes for solving the representation problem. In the examples there seems to be no 
shortage of primes in each of the relevant congruence classes. For example, for the 
form x^2 + y^2 the primes represented, apart from 2, seem to be the primes congruent
to 1 mod 4, the primes of the form 4k + 1 starting with 5, 13, 17, 29, 37, 41, 53, ⋯.
The other possibility for odd primes is the sequence 3, 7, 11, 19, 23, 31, 43, 47, ⋯,
primes of the form 4k + 3, or equivalently 4k − 1.

Such sequences form arithmetic progressions an + b for fixed positive integers 
a and b and varying n = 0, 1, 2, 3, ⋯. It is natural to ask whether there are infinitely
many primes in each arithmetic progression an + b. For this to be true an obvious
restriction is that a and b should be coprime since any common divisor of a and b will
divide every number an + b, so there could be at most one prime in the progression.

A famous theorem of Dirichlet from 1837 asserts that every arithmetic progres- 
sion an + b with a and b coprime contains an infinite number of primes. This can 
be rephrased as saying that within each congruence class of numbers x ≡ b mod a
there are infinitely many primes whenever a and b are coprime. Dirichlet’s theorem
actually says more, that primes are approximately equally distributed among the var-
ious congruence classes mod a for a fixed a. For example, there are approximately
as many primes p = 4n + 1 as there are primes p = 4n − 1.
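The approximate equidistribution can be observed numerically. A small sketch, with hypothetical helper names, that counts the primes below a bound in each congruence class mod a coprime to a:

```python
from math import gcd, isqrt

def primes_below(limit):
    """Sieve of Eratosthenes: all primes p < limit."""
    sieve = [True] * max(limit, 2)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(max(limit, 2)) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    return [p for p in range(limit) if sieve[p]]

def count_by_class(a, limit):
    """Number of primes p < limit in each class b mod a with gcd(a, b) = 1."""
    counts = {b: 0 for b in range(a) if gcd(a, b) == 1}
    for p in primes_below(limit):
        if p % a in counts:
            counts[p % a] += 1
    return counts
```

For instance `count_by_class(4, 100)` gives 11 primes of the form 4n + 1 and 13 of the form 4n − 1. Such counts only illustrate the statement and say nothing about its proof.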

Dirichlet’s Theorem is not easy to prove, and a proof would require methods quite
different from anything else in this book so we will not be giving a proof. However a
few special cases of Dirichlet’s Theorem can be proved by elementary arguments. The
simplest case is the arithmetic progression 3, 7, 11, ⋯ of numbers 4n − 1, using
a variant of Euclid’s proof that there are infinitely many primes. Recall how Euclid’s
argument goes. Suppose that p_1, ⋯, p_k is a finite list of primes, and consider the
number N = p_1 ⋯ p_k + 1. This must be divisible by some prime p, but p cannot
be any of the primes p_i on the list since dividing p_i into N gives a remainder of 1.
Thus no finite list of primes can be complete and hence there must be infinitely many
primes.

To adapt this argument to primes of the form 4n − 1, suppose that p_1, ⋯, p_k is
a finite list of such primes, and consider the number N = 4p_1 ⋯ p_k − 1. The prime
divisors of N must be odd since N is odd. If all these prime divisors were of the form
4n + 1 then N would be a product of numbers of the form 4n + 1 hence N itself
would have this form, contradicting the fact that N has the form 4n − 1. Hence N
must have a prime factor p = 4n − 1. This p cannot be any of the primes p_i since
dividing p_i into N gives a remainder of −1. Thus no finite list of primes 4n − 1 can
be a complete list.
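The argument is constructive, and can be run as a procedure that extends any finite list of primes of the form 4n − 1. A sketch with hypothetical function names, using trial division, so only practical for short lists:

```python
from math import prod

def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def new_prime_3_mod_4(primes):
    """A prime of the form 4n - 1 not in the given list of such primes."""
    n = 4 * prod(primes) - 1          # N = 4 p_1 ... p_k - 1 (N = 3 for the empty list)
    # N is 3 mod 4, so not all of its prime factors can be 1 mod 4.
    return min(p for p in prime_factors(n) if p % 4 == 3)

def grow(count):
    """Start from the empty list and apply the argument repeatedly."""
    primes = []
    while len(primes) < count:
        primes.append(new_prime_3_mod_4(primes))
    return primes
```

Starting from the empty list, `grow(3)` produces 3, 11, 131. The primes found this way are not the smallest ones of the form 4n − 1, only new ones at each step.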

This argument does not work for primes p = 4n + 1 since a number N =
4p_1 ⋯ p_k + 1 can be a product of primes of the form 4n − 1, for example 21 = 3·7,
so one could not deduce that N had a prime factor p = 4n + 1.

However, the quadratic form x^2 + y^2 can be used to show there are infinitely
many primes p = 4n + 1. In Proposition 6.18 we will show that for each discriminant
Δ there are infinitely many primes represented by forms of discriminant Δ. In the
case Δ = −4 all forms are equivalent to the form x^2 + y^2, so this form must represent
infinitely many primes. None of these primes can be of the form 4n − 1 since all values
of x^2 + y^2 are congruent to 0, 1, or 2 mod 4, as squares are always 0 or 1 mod 4.
Thus there must be infinitely many primes p = 4n + 1.

The same arguments work also for primes p = 3n + 1 and p = 3n − 1. For
p = 3n − 1 one argues just as for 4n − 1, using numbers N = 3p_1 ⋯ p_k − 1. For
p = 3n + 1 one uses the form x^2 + xy + y^2 of discriminant −3. Here again all
forms of this discriminant are equivalent so Proposition 6.18 says that x^2 + xy + y^2
represents infinitely many primes. All values of x^2 + xy + y^2 are congruent to 0 or 1
mod 3 as one can easily check by listing the various possibilities for x and y mod 3.
Thus there are infinitely many primes p = 3n + 1.
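The check by listing possibilities for x and y is a finite computation. A minimal sketch, where `residues` is a hypothetical helper returning the set of values of a form mod m:

```python
def residues(form, m):
    """Sorted list of the values of ax^2 + bxy + cy^2 mod m over all x, y mod m."""
    a, b, c = form
    return sorted({(a * x * x + b * x * y + c * y * y) % m
                   for x in range(m) for y in range(m)})

MOD4 = residues((1, 0, 1), 4)     # x^2 + y^2 mod 4
MOD3 = residues((1, 1, 1), 3)     # x^2 + xy + y^2 mod 3
```

The results are `[0, 1, 2]` and `[0, 1]` respectively, so neither form takes a value congruent to −1 modulo 4 or 3.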

We can try these arguments for arithmetic progressions 5n ± 1 and 5n ± 2 but
there are problems. The Euclidean argument we have given fails in each case for much
the same reason that it failed for primes p = 4n + 1. For the approach via quadratic
forms we would use the form x^2 + xy − y^2 of discriminant 5. This is the only form
of this discriminant, up to equivalence, so Proposition 6.18 implies that it represents
infinitely many primes. The methods in the next section will show that the primes
represented by this form are the primes p = 5n ± 1, so there are infinitely many
primes p = 5n + 1 or p = 5n − 1 but we cannot be more specific than this. Dirichlet’s
Theorem says there are infinitely many primes of each type, and in fact there are fancier
forms of the Euclidean argument that prove this, but these Euclidean arguments do
not work for the other cases p = 5n ± 2.

We have just seen three quadratic forms that represent infinitely many primes, for 
discriminants −4, −3, and 5, and Proposition 6.18 provides other examples for each
discriminant with class number 1. (Nonprimitive forms obviously cannot represent 
infinitely many primes, so these forms can be ignored.) For discriminants with larger 
class numbers Proposition 6.18 only implies that there is at least one form represent- 
ing infinitely many primes. However there is another hard theorem of Dirichlet which 
does say that each primitive form of nonsquare discriminant represents infinitely 
many primes. 


Exercises 


1. For the form Q(x, y) = x^2 + xy − y^2 do the following things:


(a) Draw enough of the topograph to show all the values less than 100 that occur 
in the topograph. This form is hyperbolic and it takes the same negative values as 
positive values, so you need not draw all the negative values. 


(b) Make a list of the primes less than 100 that occur in the topograph, and a list of 
the primes less than 100 that do not occur. 

(c) Characterize the primes in the two lists in part (b) in terms of congruence classes 
mod |Δ| where Δ is the discriminant of Q.

(d) Characterize the nonprime values in the topograph in terms of their factorizations 
into primes in the lists in part (b). 


(e) Summarize the previous parts by giving a simple criterion for determining the
numbers n such that Q(x, y) = n has an integer solution (x,y), primitive or not.
The criterion should say something like Q(x, y) = n is solvable if and only if n =
m^2 p_1 ⋯ p_k where each p_i is a prime such that...

(f) Check that all forms having the same discriminant as Q are equivalent to Q.

2. Do the same things for the form x^2 + xy + 2y^2, except that this time you only
need to consider values less than 50 instead of 100.

3. For discriminant Δ = −24 do the following:

(a) Verify that the class number is 2 and find two quadratic forms Q_1 and Q_2 of
discriminant −24 that are not equivalent.

(b) Draw topographs for Q_1 and Q_2 showing all values less than 100. (You do not
have to repeat parts of the topographs that are symmetric.)

(c) Divide the primes less than 100 into three lists: those represented by Q_1, those
represented by Q_2, and those represented by neither Q_1 nor Q_2. (No primes are
represented by both Q_1 and Q_2.)

(d) Characterize the primes in the three lists in part (c) in terms of congruence classes 
mod |Δ| = 24.

(e) Characterize the nonprime values in the topograph of Q_1 in terms of their factor-
izations into primes in the lists in part (c), and then do the same thing for Q_2. Your
answers should be in terms of whether there are an even or an odd number of prime
factors from certain of the lists.

(f) Summarize the previous parts by giving a criterion for which numbers n the equa-
tion Q_1(x, y) = n has an integer solution and likewise for the equation Q_2(x, y) = n.


4. This problem will show how things can be more complicated than in the previous 
problems. 

(a) Show that the number of equivalence classes of forms of discriminant −23 is 2
while the number of proper equivalence classes is 3, and find reduced forms Q_1 and
Q_2 of discriminant −23 that are not equivalent.

(b) Draw the topographs of Q_1 and Q_2 up to the value 70. (Again you do not have to
repeat symmetric parts.)

(c) Find a number n that occurs in both topographs, and find the x and y values that
give Q_1(x_1, y_1) = n = Q_2(x_2, y_2). (This sort of thing never happens in the previous
problems.)

(d) Find a prime p_1 in the topograph of Q_1 and a different prime p_2 in the topograph
of Q_2 such that p_1 and p_2 are congruent mod |Δ| = 23. (This sort of thing also
never happens in the previous problems.)


5. Show there are infinitely many primes of the form 6m − 1 by an argument similar
to the one used for 4m − 1.


6. Consider a discriminant Δ = q^2, q > 0, corresponding to 0-hyperbolic forms. Using
the description of the topographs of such forms obtained in the previous chapter,
show:

(a) Every number is represented by at least one form of discriminant Δ, so in particular
all primes are represented.

(b) The primes represented by a given form of discriminant Δ are exactly the primes
in certain congruence classes mod q (and hence also mod Δ).

(c) For each of the values q = 1, 2, 7, and 15 determine the class number for discriminant
Δ = q^2 and find which primes are represented by the forms in each equivalence
class.

6.2 Representations in a Fixed Discriminant


The problem of determining the numbers represented by a given form is difficult
in general, so in this section we will consider the somewhat easier question of
determining which numbers n are represented by at least one form of a given discriminant
Δ, without specifying which form this will be. We refer to this as representing
n in discriminant Δ.

On several occasions we will make use of the following fact: A form Q represents
a number a if and only if Q is equivalent to a form ax^2 + bxy + cy^2 with leading co-
efficient a. To see this, note first that the form ax^2 + bxy + cy^2 obviously represents
a when (x,y) = (1,0), hence any form equivalent to ax^2 + bxy + cy^2 also repre-
sents a. Conversely, if a form Q represents a then a appears in the topograph of
Q, and by applying a suitable linear fractional transformation we can bring the region
where a appears to the 1/0 region, changing Q to an equivalent form ax^2 + bxy + cy^2
where c is the new label on the 0/1 region and b is the new label on the edge between
the 1/0 and 0/1 regions.

Here is our first use of this principle: 


Proposition 6.1. If a number n is represented in discriminant Δ then so is every
divisor of n.


Thus for representations in a given discriminant, if we find which primes are 
represented and then which products of these primes are represented, we will have 
found all numbers that are represented. 


Proof: If n is represented in discriminant Δ then there is a form nx^2 + bxy + cy^2
of discriminant Δ. If n factors as n = n_1 n_2, then n_1 is represented by the form
n_1 x^2 + bxy + n_2 c y^2 which has the same discriminant as nx^2 + bxy + cy^2. □
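The proof amounts to moving a factor of the leading coefficient over to the last coefficient, which leaves b^2 − 4ac unchanged. A small sketch with hypothetical function names, using forms stored as triples (a, b, c):

```python
def discriminant(form):
    """Discriminant b^2 - 4ac of the form ax^2 + bxy + cy^2."""
    a, b, c = form
    return b * b - 4 * a * c

def form_for_divisor(form, n1):
    """From n x^2 + bxy + cy^2 with n = n1*n2, build n1 x^2 + bxy + n2*c y^2."""
    n, b, c = form
    if n % n1 != 0:
        raise ValueError("n1 must divide the leading coefficient")
    n2 = n // n1
    return (n1, b, n2 * c)
```

As an illustration, 15x^2 + 2xy + y^2 has discriminant −56, and `form_for_divisor((15, 2, 1), 3)` returns `(3, 2, 5)`, a form of the same discriminant with leading coefficient 3.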


There is a simple congruence criterion for when a number is represented in a 
given discriminant: 


Proposition 6.2. There exists a form of discriminant Δ that represents n if and
only if Δ is congruent to a square mod 4n.


Note that if n is negative then “mod 4n” means the same thing as “mod 4|n|” 
since being divisible by a number d is equivalent to being divisible by —d when we 
are considering both positive and negative numbers. 


Proof: Suppose n is represented by a form Q of discriminant Δ, so n appears in the
topograph of Q. If we look at an edge of the topograph bordering a region labeled n
then we obtain an equation Δ = h^2 − 4nk where h is the label on the edge and k is
the label on the region on the opposite side of this edge. The equation Δ = h^2 − 4nk
implies the congruence Δ ≡ h^2 mod 4n so Δ is a square mod 4n.

Conversely, suppose that Δ is the square of some integer h mod 4n. This means
that h^2 − Δ is an integer times 4n, or in other words h^2 − Δ = 4nk for some k. This
equation can be rewritten as Δ = h^2 − 4nk, so the form nx^2 + hxy + ky^2 has
discriminant Δ, and this form represents n when (x,y) = (1,0). □
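The second half of the proof is an explicit construction, which can be sketched as follows. The function name is hypothetical, and the search for h simply tries every residue mod 4|n|.

```python
def form_with_leading_coefficient(n, disc):
    """Return (n, h, k) with h^2 - 4nk = disc, or None if disc is not a square mod 4n."""
    if n == 0:
        raise ValueError("n must be nonzero")
    m = 4 * abs(n)
    for h in range(m):
        if (h * h - disc) % m == 0:            # disc is congruent to h^2 mod 4n
            k = (h * h - disc) // (4 * n)      # exact division, so h^2 - 4nk = disc
            return (n, h, k)
    return None
```

For example `form_with_leading_coefficient(3, -56)` returns `(3, 2, 5)`, the form 3x^2 + 2xy + 5y^2, while `form_with_leading_coefficient(2, 5)` returns `None` since 5 is not a square mod 8.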


Let us see what this proposition implies about representing small numbers n.
For n = 1 it says that there is a form of discriminant Δ representing 1 if and only
if Δ is a square mod 4. The squares mod 4 are 0 and 1, and we already know that
discriminants of forms are always congruent to 0 or 1 mod 4. So we conclude that for
every possible value of the discriminant there exists a form that represents 1. This is
not new information, however, since the principal forms x^2 + dy^2 and x^2 + xy + dy^2
represent 1 and there is a principal form in each discriminant.

In the next case n = 2 the possible values of the discriminant mod 4n = 8 are
0, 1, 4, 5, and the squares mod 8 are 0, 1, 4 since 0^2 ≡ 0, (±1)^2 ≡ 1, (±2)^2 ≡ 4,
(±3)^2 ≡ 1, and (±4)^2 ≡ 0. Thus 2 is not represented by any form of discriminant
Δ when Δ ≡ 5 mod 8, but for all other discriminants there is a form representing 2.
Explicit forms representing 2 are 2x^2 − ky^2 for Δ = 8k, 2x^2 + xy − ky^2 for Δ = 8k + 1,
and 2x^2 + 2xy − ky^2 for Δ = 8k + 4.

Moving on to the next case n = 3, the discriminants mod 12 are 0, 1, 4, 5, 8, 9
and the squares mod 12 are 0, 1, 4, 9 since 0^2 ≡ 0, (±1)^2 ≡ 1, (±2)^2 ≡ 4, (±3)^2 ≡
9, (±4)^2 ≡ 4, (±5)^2 ≡ 1, and (±6)^2 ≡ 0. The excluded discriminants are thus
those congruent to 5 or 8 mod 12. Again explicit forms are easily given, the forms
3x^2 + hxy − ky^2 with Δ = 12k + h^2 for h = 0, 1, 2, 3.

We could continue in this direction, exploring which discriminants have forms
that represent a given number, but this is not really the question we want to answer,
which is to start with a given discriminant and decide which numbers are represented
in this discriminant. The sort of answer we are looking for, based on the various
examples we looked at earlier, is also a different sort of congruence condition, with
congruence modulo the discriminant rather than congruence mod 4n. So there is
more work to be done before we would have the sort of answer we want. Nevertheless,
the representability criterion in Proposition 6.2 is the starting point.


Our approach will be to reduce the representation problem in discriminant Δ first
to the case of representing prime powers and then to representing primes themselves.
Here is the first step.


Proposition 6.3. If two coprime numbers m and n are both represented in dis-
criminant Δ then so is their product mn.


Applying this repeatedly, we see that if a number n has the prime factorization
n = p_1^{e_1} ⋯ p_k^{e_k} for distinct primes p_i, and if p_i^{e_i} is represented in discriminant Δ for
each i, then n is represented in discriminant Δ.


The main ingredient in the proof of the proposition will be the following: 


Lemma 6.4. If a number x is a square mod m_1 and is also a square mod m_2 where
m_1 and m_2 are coprime, then x is a square mod m_1 m_2.


For example, the number 2 is a square mod 7 (since 3^2 ≡ 2 mod 7) and also mod
17 (since 6^2 ≡ 2 mod 17) so 2 must also be a square mod 7·17 = 119. And in fact
2 ≡ 11^2 mod 119.


Proof: This will be a consequence of the Chinese Remainder Theorem. If x is a square
mod m_1 and also a square mod m_2 then there are numbers a_1 and a_2 such that
x ≡ a_1^2 mod m_1 and x ≡ a_2^2 mod m_2. If m_1 and m_2 are coprime then by the
Chinese Remainder Theorem there is a number a that is congruent to a_1 mod m_1
and to a_2 mod m_2, hence a^2 ≡ a_1^2 mod m_1 and a^2 ≡ a_2^2 mod m_2. Thus x ≡ a^2
mod m_1 and mod m_2. This implies x ≡ a^2 mod m_1 m_2 since the difference x − a^2
is divisible by both m_1 and m_2 and hence by their product m_1 m_2 since m_1 and m_2
are coprime. This shows that x is a square mod m_1 m_2. □
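The proof can be followed step by step in code. This sketch uses hypothetical function names, finds the square roots a_1 and a_2 by brute force, and combines them with the standard Chinese Remainder construction (it relies on `pow(m1, -1, m2)`, available in Python 3.8 and later).

```python
def sqrt_mod(x, m):
    """Smallest a >= 0 with a^2 congruent to x mod m, or None if x is not a square mod m."""
    for a in range(m):
        if (a * a - x) % m == 0:
            return a
    return None

def crt(a1, m1, a2, m2):
    """The a mod m1*m2 with a = a1 mod m1 and a = a2 mod m2, for coprime m1 and m2."""
    t = (a2 - a1) * pow(m1, -1, m2) % m2      # solve a1 + m1*t = a2 mod m2
    return (a1 + m1 * t) % (m1 * m2)

def sqrt_mod_product(x, m1, m2):
    """A square root of x mod m1*m2 built from square roots mod m1 and mod m2."""
    a1, a2 = sqrt_mod(x, m1), sqrt_mod(x, m2)
    if a1 is None or a2 is None:
        return None
    return crt(a1, m1, a2, m2)
```

With x = 2, m_1 = 7, m_2 = 17 the roots found are a_1 = 3 and a_2 = 6, and the combined root is 108, which is −11 mod 119, so it agrees with the root 11 in the example up to sign.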


Proof of Proposition 6.3: Let m and n be coprime. At least one of them must be
odd, say n is odd. If m and n are represented in discriminant Δ then Δ is a square
mod 4m and mod 4n, hence also mod n. Since 4m and n are coprime, the lemma
then says that Δ is a square mod 4mn, so mn is represented in discriminant Δ. □


Next we try to reduce further from prime powers to primes themselves. This is 
possible for most primes by the following more technical result: 


Lemma 6.5. If a number x is a square mod p for an odd prime p not dividing x,
then x is also a square mod p^r for each r ≥ 1. The corresponding statement for
the prime p = 2 is that if an odd number x is a square mod 8 then x is also a
square mod 2^r for each r ≥ 3.


For example, 2 is a square mod 7 since 2 ≡ 3^2 mod 7, so 2 is also a square mod
7^2, namely 2 ≡ 10^2 mod 49. It is also a square mod 7^3 = 343 since 2 ≡ 108^2 mod
343. Likewise it must be a square mod 7^4, mod 7^5, etc. The proof of the lemma will
give a method for refining the initial congruence 2 ≡ 3^2 mod 7 to each subsequent
congruence 2 ≡ 10^2 mod 49, 2 ≡ 108^2 mod 343, etc.

For the prime p = 2 we have to begin with squares mod 8 since 3 is a square 
mod 2 but not mod 4, while 5 is a square mod 4 but not mod 8. 


Proof of Lemma 6.5: We will show that if x is a square mod p^r then it is also a
square mod p^{r+1}, assuming r ≥ 1 in the case that p is odd and r ≥ 3 in the case
p = 2. By induction this will prove the lemma.


We begin by assuming that x is a square mod p^r, so there is a number y such
that x ≡ y^2 mod p^r or in other words p^r divides x − y^2, say x − y^2 = p^r l for
some integer l. We would like to find a number z such that x ≡ z^2 mod p^{r+1}, so
it is reasonable to look for a z with z ≡ y mod p^r, or in other words z = y + kp^r
for some k. Thus we want to choose k so that x ≡ (y + kp^r)^2 mod p^{r+1}. In other
words we want p^{r+1} to divide x − (y + kp^r)^2. This can be rewritten as:

    x − (y + kp^r)^2 = x − (y^2 + 2kp^r y + k^2 p^{2r})
                     = x − y^2 − 2kp^r y − k^2 p^{2r}
                     = p^r l − 2kp^r y − k^2 p^{2r}      since x − y^2 = p^r l
                     = p^r (l − 2ky − k^2 p^r)

For this to be divisible by p^{r+1} means that p should divide l − 2ky − k^2 p^r. Since we
assume r ≥ 1 this is equivalent to p dividing l − 2ky, or in other words, l − 2ky = pq
for some integer q. Rewriting this as l = 2yk + pq, we see that this linear Diophantine
equation with unknowns k and q always has a solution when p is odd since 2y and
p are coprime if p is odd, in view of the fact that p does not divide y since x ≡ y^2
mod p^r and we assume x is not divisible by p. This finishes the induction step in
the case that p is odd.

When $p = 2$ this argument breaks down at the last step since the equation
$l = 2yk + pq$ becomes $l = 2yk + 2q$ and this will not have a solution when $l$ is odd.
To modify the proof so that it works for $p = 2$ we would like to get rid of the factor 2
in the equation $l = 2yk + pq$ which arose when we squared $y + kp^r$. To do this,
suppose that instead of trying $z = y + k \cdot 2^r$ we try $z = y + k \cdot 2^{r-1}$. Then
we would want $2^{r+1}$ to divide $x - (y + k \cdot 2^{r-1})^2$. Again this can be rewritten:


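$$\begin{aligned}
x - (y + k \cdot 2^{r-1})^2 &= x - y^2 - k \cdot 2^r y - k^2 2^{2r-2} \\
&= 2^r l - k \cdot 2^r y - k^2 2^{2r-2} \qquad \text{since } x - y^2 = 2^r l \\
&= 2^r (l - ky - k^2 2^{r-2})
\end{aligned}$$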
Assuming $r \ge 3$, this means 2 should divide $l - ky$, or in other words $l = yk + 2q$
for some integer $q$. The number $y$ is odd since $y^2 \equiv x$ mod $2^r$ and $x$ is odd
by assumption. This implies the equation $l = yk + 2q$ has a solution $(k, q)$. □
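The induction step in this proof is effectively an algorithm for refining a square root
mod $p^r$ to one mod $p^{r+1}$. The following is a minimal Python sketch of that step,
not part of the original text; the function names `lift_odd` and `lift_two` are
hypothetical, and the modular inverse `pow(a, -1, p)` assumes Python 3.8 or later.

```python
def lift_odd(x, p, y, r):
    """Given y*y ≡ x (mod p**r) for an odd prime p not dividing x,
    return z ≡ y (mod p**r) with z*z ≡ x (mod p**(r+1))."""
    pr = p ** r
    l = (x - y * y) // pr              # x - y^2 = p^r * l, an exact division
    k = l * pow(2 * y, -1, p) % p      # solve l ≡ 2yk (mod p); 2y is invertible mod p
    return (y + k * pr) % (p * pr)


def lift_two(x, y, r):
    """Given y*y ≡ x (mod 2**r) with x odd and r >= 3,
    return z ≡ y (mod 2**(r-1)) with z*z ≡ x (mod 2**(r+1))."""
    l = (x - y * y) // 2 ** r          # x - y^2 = 2^r * l, an exact division
    k = l % 2                          # solve l ≡ yk (mod 2); y is odd
    return (y + k * 2 ** (r - 1)) % 2 ** (r + 1)


# Refine 2 ≡ 3^2 (mod 7) step by step, as in the example before the proof.
y = 3
for r in range(1, 3):
    y = lift_odd(2, 7, y, r)
    print(f"2 ≡ {y}^2 mod 7^{r + 1}")
```

Starting from $y = 3$ the two steps produce 10 and then 108, matching the congruences
$2 \equiv 10^2$ mod 49 and $2 \equiv 108^2$ mod 343 quoted earlier.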


Proposition 6.6. If a prime $p$ not dividing the discriminant $\Delta$ is represented by a
form of discriminant $\Delta$ then every power of $p$ is also represented by a form of
discriminant $\Delta$.


Proof: First we consider odd primes $p$. If $p$ is represented in discriminant $\Delta$
then $\Delta$ is a square mod $4p$ and hence mod $p$. The preceding lemma then says that
$\Delta$ is a square mod each power $p^r$. From this it follows by the earlier Lemma 6.4
that $\Delta$ is also a square mod $4p^r$ since $\Delta$ is always a square mod 4. Thus by
Proposition 6.2 all powers of $p$ are represented in discriminant $\Delta$.

For $p = 2$ the argument is almost the same. In this case the representability of 2
implies that $\Delta$ is a square mod $4 \cdot 2 = 8$ so the lemma implies that $\Delta$ is
also a square mod $4 \cdot 2^r$ for all $r \ge 1$ so all powers of 2 are represented. □

In the examples for the representation problem that we looked at in the preceding
section we saw that primes that divide the discriminant behave differently from primes
that do not, and the differences begin at this point:


Proposition 6.7. Each prime dividing the discriminant $\Delta$ is represented in
discriminant $\Delta$. If a prime $p$ divides $\Delta$ but not the conductor of $\Delta$
then no form of discriminant $\Delta$ represents $p^2$ or any higher power of $p$.


Recall that the conductor for discriminant $\Delta$ is the largest positive number $d$
such that $\Delta = d^2 \Delta'$ for some discriminant $\Delta'$. This $\Delta'$ is then a
fundamental discriminant. Fundamental discriminants are those with conductor 1.


Proof: We saw earlier that 2 is represented in all discriminants not congruent to
5 mod 8 so in particular this includes all even discriminants. For an odd prime $p$
dividing $\Delta$ we have $\Delta \equiv 0$ mod $p$ so $\Delta$ is a square mod $p$, namely
$0^2$. Since $p$ is odd it follows that $\Delta$ is also a square mod $4p$ and hence $p$ is
represented in discriminant $\Delta$.

Suppose now that $p$ is a prime dividing $\Delta$ and some form of discriminant $\Delta$
represents $p^2$. This form is equivalent to a form $p^2x^2 + bxy + cy^2$ with $p$ dividing
$\Delta = b^2 - 4p^2c$ so $p$ must divide $b^2$. Since $p$ is prime it must then divide $b$,
so in fact $p^2$ divides $b^2$. Therefore $p^2$ divides $\Delta = b^2 - 4p^2c$ and we have
$\Delta = p^2\Delta'$ for some integer $\Delta'$.

Consider first the case that $p$ is odd. Then $p^2 \equiv 1$ mod 4 so
$\Delta \equiv \Delta'$ mod 4. This means that $\Delta'$ is also a discriminant, so by the
definition of the conductor, $p$ divides the conductor. Thus if $p$ divides $\Delta$ but
not the conductor then $p^2$ cannot be represented by any form of discriminant $\Delta$.

In the case that $p = 2$ the assumption that $p$ divides $\Delta$ means that $\Delta$ is
even and hence so is $b$. The discriminant equation $\Delta = b^2 - 4p^2c$ is now
$\Delta = b^2 - 4 \cdot 2^2 c$ so $\Delta \equiv b^2$ mod 16. The only squares of even
numbers mod 16 are 0 and 4, as one sees by checking $0^2$, $(\pm 2)^2$, $(\pm 4)^2$,
$(\pm 6)^2$, and $8^2$, so $\Delta$ is either $16k = 4(4k)$ or $16k + 4 = 4(4k + 1)$. In
both cases $\Delta$ is 4 times a discriminant so 2 divides the conductor.

Once we know that $p^2$ is not represented in discriminant $\Delta$ then neither is any
multiple of $p^2$ by Proposition 6.1, and in particular higher powers of $p$ are not
represented. □


Here is a summary of what we have shown so far in the case of fundamental 
discriminants: 


Theorem 6.8. If $\Delta$ is a fundamental discriminant then the numbers $n > 1$ that
are represented by at least one form of discriminant $\Delta$ are exactly the numbers
that factor as a product $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ of powers of distinct
primes $p_i$ each of which is represented by some form of discriminant $\Delta$, with the
restriction that $e_i \le 1$ for primes $p_i$ dividing $\Delta$.

The situation for nonfundamental discriminants is more complicated and will be
described later in Theorem 6.11.


For the problem of determining which primes are represented in a given discrim- 
inant we already know when 2 is represented and we know that primes dividing the 
discriminant are always represented. After these special cases what remains are the 
odd primes not dividing the discriminant, which can be regarded as the generic case. 

An odd prime $p$ will be represented in discriminant $\Delta$ exactly when $\Delta$ is a
square mod $p$. Let us introduce some convenient notation for this condition. For $p$ an
odd prime and $a$ an integer not divisible by $p$, define the Legendre symbol
$\left(\frac{a}{p}\right)$ by

$$\left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if $a$ is a square mod $p$} \\ -1 & \text{if $a$ is not a square mod $p$} \end{cases}$$

Using this notation we can say:

    An odd prime $p$ that does not divide $\Delta$ is represented in discriminant
    $\Delta$ if and only if $\left(\frac{\Delta}{p}\right) = +1$.


It will therefore be useful to know how to compute $\left(\frac{\Delta}{p}\right)$. The
following four basic properties of the Legendre symbol make this a feasible task:


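(1) $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.
(2) $\left(\frac{-1}{p}\right) = +1$ if $p \equiv 1$ mod 4 and $\left(\frac{-1}{p}\right) = -1$ if $p \equiv 3$ mod 4.
(3) $\left(\frac{2}{p}\right) = +1$ if $p \equiv \pm 1$ mod 8 and $\left(\frac{2}{p}\right) = -1$ if $p \equiv \pm 3$ mod 8.
(4) If $p$ and $q$ are distinct odd primes then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ unless $p$ and $q$ are both
    congruent to 3 mod 4, in which case $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$.
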
Property (1), applied repeatedly, reduces the calculation of $\left(\frac{a}{p}\right)$ to
the calculation of $\left(\frac{q}{p}\right)$ for the various prime factors $q$ of $a$,
along with $\left(\frac{-1}{p}\right)$ when $a$ is negative. Note that
$\left(\frac{q^2}{p}\right) = +1$ so we can immediately reduce to the case that $|a|$ is a
product of distinct primes. Property (2) will be used when dealing with negative
discriminants, and property (3) will be used for certain even discriminants.

Property (4) is called quadratic reciprocity. This is by far the most subtle of the 
four properties, and proving it is considerably more difficult than for the other three 
properties. We will give a proof in Section 6.4, obtaining proofs of the first three 
properties along the way. 

For a quick illustration of the usefulness of these properties let us see how they
can be used to compute the values of Legendre symbols. Suppose for example that
one wanted to know whether 78 was a square mod 89. The naive approach would
be to list the squares of all the numbers $\pm 1, \cdots, \pm 44$ and see whether any of
these was congruent to 78 mod 89, but this would be rather tedious. Since 89 is prime
we can instead evaluate $\left(\frac{78}{89}\right)$ using the basic properties of Legendre
symbols. First we factor 78 to get
$\left(\frac{78}{89}\right) = \left(\frac{2}{89}\right)\left(\frac{3}{89}\right)\left(\frac{13}{89}\right)$.
By property (3) we have $\left(\frac{2}{89}\right) = +1$ since $89 \equiv 1$ mod 8. Next,
reciprocity gives $\left(\frac{3}{89}\right) = \left(\frac{89}{3}\right)$ and
$\left(\frac{13}{89}\right) = \left(\frac{89}{13}\right)$ since $89 \equiv 1$ mod 4. After
this we use the fact that $\left(\frac{a}{p}\right)$ depends only on the value of $a$ mod
$p$ to reduce $\left(\frac{89}{3}\right)$ to $\left(\frac{2}{3}\right)$ and
$\left(\frac{89}{13}\right)$ to $\left(\frac{11}{13}\right)$. Using property (3) again, we
have $\left(\frac{2}{3}\right) = -1$, confirming the obvious fact that 2 is not a square
mod 3. For $\left(\frac{11}{13}\right)$, reciprocity says this equals
$\left(\frac{13}{11}\right)$. This reduces to $\left(\frac{2}{11}\right) = -1$.
Summarizing, we have:

$$\left(\frac{78}{89}\right) = \left(\frac{2}{89}\right)\left(\frac{3}{89}\right)\left(\frac{13}{89}\right) = (+1)(-1)(-1) = +1$$

Thus we see that 78 is a square mod 89, even though we have not found an actual
number $x$ such that $x^2 \equiv 78$ mod 89.
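The reduction just carried out by hand can be mechanized. The following is a minimal
Python sketch, not taken from the text, that evaluates a Legendre symbol using only the
four properties above together with the fact that the symbol depends only on $a$ mod $p$.
The names `prime_factors` and `legendre` are hypothetical, and the trial-division
factoring is only meant for small inputs. Property (2) is never needed here because the
numerator is first reduced to a positive residue.

```python
def prime_factors(n):
    """Prime factorization of n >= 1 by trial division, with multiplicity."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a."""
    a %= p                               # (a/p) depends only on a mod p
    result = 1
    for q in prime_factors(a):           # property (1): multiplicativity
        if q == 2:
            result *= 1 if p % 8 in (1, 7) else -1            # property (3)
        else:
            sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1   # property (4)
            result *= sign * legendre(p, q)
    return result


print(legendre(78, 89))
```

The final line evaluates the symbol from the worked example and prints 1, in agreement
with the hand computation.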

In this example we used the fact that the modulus 89 was prime, but we have
already seen how to reduce to the case of prime moduli. For example, if we wanted
to determine whether 78 is a square mod 88 we know this is the case exactly when it
is a square mod 8 and mod 11. The squares mod 8 are 0, 1, and 4 whereas
$78 \equiv 6$ mod 8 so 78 is not a square mod 8 and therefore not mod 88 either, even
though $78 \equiv 1$ mod 11 so 78 is a square mod 11.


Returning now to quadratic forms, let us see what the basic properties of Legendre 
symbols tell us about which primes are represented by some of the forms discussed 
at the beginning of the chapter. In the first four cases the class number is 1 so we will 
be determining which primes are represented by the given form, and Theorem 6.8 
will then say exactly which numbers are represented by this form, confirming the 
conjectures made when we looked at the topographs. 


Example: $x^2 + y^2$ with $\Delta = -4$. This form obviously represents 2, the only prime
dividing $\Delta$, and it represents an odd prime $p$ exactly when
$\left(\frac{-4}{p}\right) = +1$. Using the first of the four properties we have
$\left(\frac{-4}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)\left(\frac{2}{p}\right) = \left(\frac{-1}{p}\right)$,
and the second property says this is $+1$ exactly for primes $p = 4k + 1$. Thus we see the
primes represented by $x^2 + y^2$ are 2 and the primes $p = 4k + 1$.


Example: $x^2 + 2y^2$ with $\Delta = -8$. Again the only prime dividing $\Delta$ is 2, and
it is represented. For odd primes $p$ we have
$\left(\frac{-8}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{8}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)$.
In the four cases $p \equiv 1, 3, 5, 7$ mod 8 this is, respectively, $(+1)(+1)$,
$(-1)(-1)$, $(+1)(-1)$, and $(-1)(+1)$. We conclude that the primes represented by the
form $x^2 + 2y^2$ are 2 and primes congruent to 1 or 3 mod 8.


Example: $x^2 - 2y^2$ with $\Delta = 8$. The only prime dividing $\Delta$ is 2 which is
represented when $(x, y) = (2, 1)$. For odd primes $p$ we have
$\left(\frac{8}{p}\right) = \left(\frac{2}{p}\right)^3 = \left(\frac{2}{p}\right)$ so
property (3) implies that the primes represented by $x^2 - 2y^2$ are 2 and
$p \equiv \pm 1$ mod 8.


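Example: $x^2 + xy + y^2$ with $\Delta = -3$. The only prime dividing the discriminant is
3 and it is represented. The prime 2 is not represented since $\Delta \equiv 5$ mod 8. For
primes $p > 3$ we can evaluate $\left(\frac{-3}{p}\right)$ using quadratic reciprocity:

$$\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = \begin{cases} (+1)\left(+\left(\frac{p}{3}\right)\right) & \text{if } p = 4k + 1 \\ (-1)\left(-\left(\frac{p}{3}\right)\right) & \text{if } p = 4k + 3 \end{cases}$$

So we get $\left(\frac{p}{3}\right)$ in both cases. Since $\left(\frac{p}{3}\right)$ only
depends on $p$ mod 3, we have $\left(\frac{p}{3}\right) = +1$ if $p \equiv 1$ mod 3 and
$\left(\frac{p}{3}\right) = -1$ if $p \equiv 2$ mod 3. (Since $p \ne 3$ we do not need to
consider the possibility $p \equiv 0$ mod 3.) The conclusion is that the primes
represented by $x^2 + xy + y^2$ are 3 and the primes $p \equiv 1$ mod 3.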
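The congruence conditions found in the three positive definite examples above can be
compared with a direct search. The following Python sketch is an illustration added
here, with hypothetical names `primes_up_to`, `form_values` and `represented_primes`.
It enumerates values of a form over a box that is generous enough for these three
forms, and it ignores the requirement that $(x, y)$ be a primitive pair, which makes no
difference for prime values.

```python
from math import isqrt


def primes_up_to(n):
    """Sieve of Eratosthenes for n >= 2."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, flag in enumerate(sieve) if flag]


def form_values(a, b, c, limit):
    """Positive values a*x^2 + b*x*y + c*y^2 <= limit over a box of pairs (x, y)."""
    bound = isqrt(2 * limit) + 1       # generous for the three forms used below
    values = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            v = a * x * x + b * x * y + c * y * y
            if 0 < v <= limit:
                values.add(v)
    return values


def represented_primes(a, b, c, limit):
    values = form_values(a, b, c, limit)
    return {p for p in primes_up_to(limit) if p in values}


LIMIT = 500
print(sorted(represented_primes(1, 0, 1, LIMIT))[:8])   # x^2 + y^2
print(sorted(represented_primes(1, 0, 2, LIMIT))[:8])   # x^2 + 2y^2
print(sorted(represented_primes(1, 1, 1, LIMIT))[:8])   # x^2 + xy + y^2
```

A search like this only confirms the pattern up to the chosen limit; the Legendre symbol
computations above are what establish it for all primes.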


Example: $\Delta = 40$. Here all forms are equivalent to either $x^2 - 10y^2$ or
$2x^2 - 5y^2$. The primes dividing 40 are 2 and 5 so these are represented by one form or
the other, and in fact both are represented by $2x^2 - 5y^2$ as the topographs showed. For
other primes $p$ we have
$\left(\frac{40}{p}\right) = \left(\frac{2}{p}\right)\left(\frac{5}{p}\right) = \left(\frac{2}{p}\right)\left(\frac{p}{5}\right)$.
The factor $\left(\frac{2}{p}\right)$ depends only on $p$ mod 8 and
$\left(\frac{p}{5}\right)$ depends only on $p$ mod 5, so their product depends only on $p$
mod 40. The following table lists all the possibilities for congruence classes mod 40
not divisible by 2 or 5:

$$\begin{array}{c|cccccccccccccccc}
p \bmod 40 & 1 & 3 & 7 & 9 & 11 & 13 & 17 & 19 & 21 & 23 & 27 & 29 & 31 & 33 & 37 & 39 \\ \hline
\left(\frac{2}{p}\right) & +1 & -1 & +1 & +1 & -1 & -1 & +1 & -1 & -1 & +1 & -1 & -1 & +1 & +1 & -1 & +1 \\
\left(\frac{p}{5}\right) & +1 & -1 & -1 & +1 & +1 & -1 & -1 & +1 & +1 & -1 & -1 & +1 & +1 & -1 & -1 & +1
\end{array}$$

The product $\left(\frac{2}{p}\right)\left(\frac{p}{5}\right)$ is $+1$ in exactly the eight
cases $p \equiv 1, 3, 9, 13, 27, 31, 37, 39$ mod 40. We conclude that these are the eight
congruence classes containing primes (other than 2 and 5) represented by one of the two
forms $x^2 - 10y^2$ and $2x^2 - 5y^2$. This agrees with our earlier observations based on
the topographs. However, we have yet to verify our earlier guesses as to which congruence
classes are represented by which form. We will see how to do this in the next section.
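The list of eight classes mod 40 can be recomputed in a few lines. This Python sketch is
an added illustration with hypothetical helper names. It uses property (3) for
$\left(\frac{2}{p}\right)$ and the fact that the nonzero squares mod 5 are 1 and 4 for
$\left(\frac{p}{5}\right)$.

```python
from math import gcd


def symbol_2_over(p):
    """(2/p) for odd p, by property (3): depends only on p mod 8."""
    return 1 if p % 8 in (1, 7) else -1


def symbol_over_5(p):
    """(p/5) for p not divisible by 5: the nonzero squares mod 5 are 1 and 4."""
    return 1 if p % 5 in (1, 4) else -1


classes = [r for r in range(1, 40) if gcd(r, 40) == 1]
good = [r for r in classes if symbol_2_over(r) * symbol_over_5(r) == 1]
print(good)
```

The printed list is 1, 3, 9, 13, 27, 31, 37, 39, the same eight classes read off from the
table.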


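In the examples above we were able to express $\left(\frac{\Delta}{p}\right)$ in terms of
Legendre symbols $\left(\frac{-1}{p}\right)$, $\left(\frac{2}{p}\right)$, and
$\left(\frac{p}{p_i}\right)$ for odd primes $p_i$ dividing $\Delta$. The following result
shows that this can be done for all $\Delta$: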


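Proposition 6.9. Let the nonzero integer $\Delta$ be factored as
$\Delta = \varepsilon 2^s p_1 \cdots p_k$ for $\varepsilon = \pm 1$, $s \ge 0$, and each
$p_i$ an odd prime. (We allow $k = 0$ when $\Delta = \varepsilon 2^s$.) Then for odd primes
$p$ not dividing $\Delta$ the Legendre symbol $\left(\frac{\Delta}{p}\right)$ has the value
given in the following table:

$$\begin{array}{l|l}
\Delta & \left(\frac{\Delta}{p}\right) \\ \hline
2^{2l}(4m+1) & \left(\frac{p}{p_1}\right) \cdots \left(\frac{p}{p_k}\right) \\
2^{2l}(4m+3) & \left(\frac{-1}{p}\right)\left(\frac{p}{p_1}\right) \cdots \left(\frac{p}{p_k}\right) \\
2^{2l+1}(4m+1) & \left(\frac{2}{p}\right)\left(\frac{p}{p_1}\right) \cdots \left(\frac{p}{p_k}\right) \\
2^{2l+1}(4m+3) & \left(\frac{-1}{p}\right)\left(\frac{2}{p}\right)\left(\frac{p}{p_1}\right) \cdots \left(\frac{p}{p_k}\right)
\end{array}$$


Proof: For $\Delta = \varepsilon 2^s p_1 \cdots p_k$ quadratic reciprocity gives

$$\left(\frac{\Delta}{p}\right) = \left(\frac{\varepsilon}{p}\right)\left(\frac{2}{p}\right)^s\left(\frac{p_1}{p}\right) \cdots \left(\frac{p_k}{p}\right) = \left(\frac{\varepsilon}{p}\right)\left(\frac{2}{p}\right)^s\left(\frac{\omega}{p}\right)\left(\frac{p}{p_1}\right) \cdots \left(\frac{p}{p_k}\right)$$

where $\omega$ is $+1$ or $-1$ according to whether there are an even or an odd number of
factors $p_i \equiv 3$ mod 4. The exponent $s$ in this formula can be replaced by 0 or 1
according to whether $s$ is even or odd. In the first and third rows of the table the odd
part of $\Delta$ is $4m + 1$ so we have $\varepsilon = \omega$ and therefore
$\left(\frac{\varepsilon}{p}\right)\left(\frac{\omega}{p}\right) = 1$. In the second and
fourth rows the factor $4m + 1$ is replaced by $4m + 3$ and we have
$\varepsilon = -\omega$, hence
$\left(\frac{\varepsilon}{p}\right)\left(\frac{\omega}{p}\right) = \left(\frac{-1}{p}\right)$. □


Corollary 6.10. The representability of an odd prime $p$ in discriminant $\Delta$ depends
only on the congruence class of $p$ mod $\Delta$.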


Proof: The class of $p$ mod $\Delta$ determines its class mod $p_i$ for each $i$ and this
determines $\left(\frac{p}{p_i}\right)$. For the terms $\left(\frac{-1}{p}\right)$ and
$\left(\frac{2}{p}\right)$ in the last three rows of the table, note first that $l$ must be
at least 1 in these rows since $\Delta$ is a discriminant. In the second row the class of
$p$ mod $\Delta$ determines its class mod 4 so it determines $\left(\frac{-1}{p}\right)$.
In the third and fourth rows the class of $p$ mod $\Delta$ determines its class mod 8 so
both $\left(\frac{-1}{p}\right)$ and $\left(\frac{2}{p}\right)$ are determined. Thus in all
cases the factors of $\left(\frac{\Delta}{p}\right)$ are determined by the class of $p$ mod
$\Delta$ so $\left(\frac{\Delta}{p}\right)$ is determined. □
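Corollary 6.10 lends itself to a numerical sanity check. The Python sketch below is an
added illustration with hypothetical names. For a given discriminant it groups odd primes
not dividing $\Delta$ by their residue mod $|\Delta|$ and records which values of
$\left(\frac{\Delta}{p}\right)$ occur in each class, computing the symbol by brute force
rather than by reciprocity.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def legendre_brute(a, p):
    """(a/p) for an odd prime p not dividing a, by listing squares mod p."""
    return 1 if any(x * x % p == a % p for x in range(1, p)) else -1


def symbols_by_class(delta, limit=1000):
    """Map each residue class p mod |delta| to the set of values of (delta/p) seen."""
    seen = {}
    for p in range(3, limit, 2):
        if is_prime(p) and delta % p != 0:
            seen.setdefault(p % abs(delta), set()).add(legendre_brute(delta, p))
    return seen


for delta in (-3, -4, 5, 8, -8, 12, -12, -20, -28, 40):
    table = symbols_by_class(delta)
    print(delta, all(len(values) == 1 for values in table.values()))
```

Each line of output pairs a discriminant with a Boolean that is true when every residue
class carries a single symbol value for the primes tested, as the corollary predicts.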


Our next result generalizes Theorem 6.8 to cover all discriminants. As one can 
see, the general statement is considerably more complicated than for fundamental 
discriminants. 


Theorem 6.11. A number $n > 1$ is represented by at least one form of discriminant
$\Delta$ exactly when $n$ factors as a product $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ of
powers of distinct primes $p_i$ each of which is represented by some form of discriminant
$\Delta$, where $e_i \le 1$ for primes $p_i$ dividing $\Delta$ but not the conductor, while
for primes $p = p_i$ dividing the conductor the allowed exponents $e = e_i$ are given by
the following rules. First write $\Delta = p^s q$ with $p^s$ the highest power of $p$
dividing $\Delta$. Then if $p$ is odd the allowable exponents $e$ are those for which either

(a) $e \le s$ or

(b) $e > s$, $s$ is even, and $\left(\frac{q}{p}\right) = +1$.

If $p = 2$ then the allowable exponents $e$ are those for which either

(a) $e \le s - 2$ or

(b) $s$ is even and $e$ is as in the following table:

$$\begin{array}{c|cccc}
q \bmod 8 & 1 & 3 & 5 & 7 \\ \hline
e & \text{all} & \le s-1 & \le s & \le s-1
\end{array}$$


Examples will be given following the proof. The main part of the proof is contained
in a lemma:

Lemma 6.12. Suppose that a number $x$ divisible by a prime $p$ factors as $p^s q$ where
$p$ does not divide $q$, so $p^s$ is the largest power of $p$ dividing $x$. Then:

(a) $x$ is a square mod $p^r$ for each $r \le s$.

(b) If $r > s$ and $s$ is odd then $x$ is not a square mod $p^r$.

(c) If $r > s$ and $s$ is even then $x$ is a square mod $p^r$ if and only if $q$ is a
    square mod $p^{r-s}$.


Proof: Part (a) is easy since $x$ is 0 mod $p^s$ hence also mod $p^r$ if $r \le s$, and 0
is always a square mod anything.

For (b) we assume $r > s$ and $s$ is odd. Suppose $p^s q$ is a square mod $p^r$, so
$p^s q = y^2 + lp^r$ for some integers $y$ and $l$. Then $p^s$ divides $y^2 + lp^r$ and it
divides $lp^r$ (since $r > s$) so $p^s$ divides $y^2$. Since $s$ is assumed to be odd and
the exponent of $p$ in $y^2$ must be even, this implies $p^{s+1}$ divides $y^2$. It also
divides $lp^r$ since $s + 1 \le r$, so from the equation $p^s q = y^2 + lp^r$ we conclude
that $p$ divides $q$, contrary to the definition of $q$. This contradiction shows that
$p^s q$ is not a square mod $p^r$ when $r > s$ and $s$ is odd, so statement (b) is proved.

For (c) we assume $r > s$ and $s$ is even. As in part (b), if $p^s q$ is a square mod
$p^r$ we have an equation $p^s q = y^2 + lp^r$ and this implies that $p^s$ divides $y^2$.
Since $s$ is now even, this means $y^2 = p^s z^2$ for some number $z$. Canceling $p^s$ from
$p^s q = y^2 + lp^r$ yields an equation $q = z^2 + lp^{r-s}$, which says that $q$ is a
square mod $p^{r-s}$. Conversely, if $q$ is a square mod $p^{r-s}$ we have an equation
$q = z^2 + lp^{r-s}$ and hence $p^s q = p^s z^2 + lp^r$. Since $s$ is even, this says that
$p^s q$ is a square mod $p^r$. □
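The three cases of Lemma 6.12 can be compared against a direct search over residues. The
following Python sketch is an added illustration; the names `is_square_mod` and
`lemma_prediction` are hypothetical and the search is exhaustive, so it is only practical
for small moduli.

```python
def is_square_mod(x, m):
    """True if x is congruent to a square mod m (exhaustive search over residues)."""
    return any((y * y - x) % m == 0 for y in range(m))


def lemma_prediction(p, s, q, r):
    """Whether x = p^s * q (with p not dividing q) should be a square mod p^r."""
    if r <= s:
        return True                            # case (a)
    if s % 2 == 1:
        return False                           # case (b)
    return is_square_mod(q, p ** (r - s))      # case (c)


# Compare prediction and search for a few small cases.
for p, s, q, r in [(3, 1, 2, 2), (3, 2, 1, 4), (3, 2, 2, 3), (2, 2, 3, 3), (2, 2, 3, 4)]:
    x = p ** s * q
    print(p, s, q, r, is_square_mod(x, p ** r), lemma_prediction(p, s, q, r))
```

In each printed row the last two entries should agree, since the first is the search
result and the second is the lemma's answer.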


Proof of Theorem 6.11: As in the proof of Theorem 6.8 the question reduces to
representing powers of primes. We know from Proposition 6.6 that all powers of a prime
not dividing the discriminant $\Delta$ are represented if the prime itself is represented.
By Proposition 6.7 we also know that primes $p$ dividing $\Delta$ are represented, and
their powers $p^e$ with $e > 1$ cannot be represented unless $p$ divides the conductor. For
the remaining case of primes dividing the conductor we will apply the preceding lemma
with $x = \Delta$.

For odd $p$ dividing $\Delta$ we need to determine when $\Delta$ is a square mod $p^e$. By
the lemma the times this happens are when $e \le s$, or when $e > s$ and $s$ is even and
$q$ is a square mod $p^{e-s}$. When $e > s$ this last condition amounts just to $q$ being a
square mod $p$ by Lemma 6.5, or in other words $\left(\frac{q}{p}\right) = +1$.

When $p = 2$ we need to determine when $\Delta$ is a square mod $4 \cdot 2^e = 2^{e+2}$.
By the lemma this happens only when $e \le s - 2$ or when $s$ is even and $q$ (which is
odd) is a square mod $2^{e+2-s}$. If $e = s - 1$ then $e + 2 - s = 1$ and every $q$ is a
square mod $2^{e+2-s} = 2$. If $e = s$ then $e + 2 - s = 2$ and $q$ is a square mod
$2^{e+2-s} = 4$ only when $q = 4k + 1$. And if $e \ge s + 1$ then $e + 2 - s \ge 3$ and $q$
is a square mod $2^{e+2-s}$ only when it is a square mod 8, which means $q = 8k + 1$. □

Let us look at two examples illustrating some of the more subtle possibilities
in the preceding theorem. The examples involve the rather simple forms $x^2 + ny^2$
whose discriminant $-4n$ is sometimes not a fundamental discriminant such as when
$n$ is congruent to 3 mod 4. The examples will be the cases $n = 3, 7$.

Example: $\Delta = -12$ with conductor 2. The two forms here are $Q_1 = x^2 + 3y^2$ and
the nonprimitive form $Q_2 = 2x^2 + 2xy + 2y^2$:

[Figure: topograph of $Q_1(x, y) = x^2 + 3y^2$]

The primes represented in discriminant $-12$ are 2, 3, and primes $p$ with
$\left(\frac{-12}{p}\right) = \left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = \left(\frac{p}{3}\right) = +1$,
so these are the primes $p \equiv 1$ mod 3. By Theorem 6.11 the numbers represented in
discriminant $-12$ are the numbers $n = 2^a 3^b p_1 \cdots p_k$ with $a \le 2$, $b \le 1$,
and each $p_i$ a prime congruent to 1 mod 3. (When we apply the theorem for $p_i = 2$ we
have $s = 2$ and $q = -3$.) We can in fact determine which of $Q_1$ and $Q_2$ is giving
these representations. The form $Q_2$ is twice $x^2 + xy + y^2$ and we have already
determined which numbers the latter form represents, namely the products
$3^b p_1 \cdots p_k$ with $b \le 1$ and each prime $p_i \equiv 1$ mod 3. Thus, of the
numbers represented by $Q_1$ or $Q_2$, the numbers represented by $Q_2$ are those with
$a = 1$. None of these numbers with $a = 1$ are represented by $Q_1$ since $x^2 + 3y^2$ is
never 2 mod 4, as $x^2$ and $y^2$ must be 0 or 1 mod 4.
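The bound $a \le 2$ in this example comes from the exponent rules of Theorem 6.11, whose
proof rests on the criterion that $p^e$ is represented in discriminant $\Delta$ exactly
when $\Delta$ is a square mod $4p^e$. The Python sketch below, an added illustration with
hypothetical function names, encodes the rules as stated and compares them with that
criterion by exhaustive search for a few primes dividing the conductor.

```python
def is_square_mod(x, m):
    """True if x is congruent to a square mod m (exhaustive search)."""
    return any((y * y - x) % m == 0 for y in range(m))


def split_power(delta, p):
    """Write delta = p^s * q with p not dividing q; return (s, q)."""
    s, q = 0, delta
    while q % p == 0:
        q //= p
        s += 1
    return s, q


def exponent_allowed(delta, p, e):
    """Exponent rule of Theorem 6.11 for a prime p dividing the conductor of delta."""
    s, q = split_power(delta, p)
    if p != 2:
        return e <= s or (s % 2 == 0 and is_square_mod(q, p))
    if e <= s - 2:
        return True
    if s % 2 == 1:
        return False
    residue = q % 8                    # Python returns a value in 0..7 even for q < 0
    if residue == 1:
        return True
    return e <= (s if residue == 5 else s - 1)


def power_represented(delta, p, e):
    """Criterion used in the proof: delta is a square mod 4 * p^e."""
    return is_square_mod(delta, 4 * p ** e)


for e in range(1, 5):
    print(e, exponent_allowed(-12, 2, e), power_represented(-12, 2, e))
```

For $\Delta = -12$ and $p = 2$ both columns are true for $e = 1, 2$ and false for
$e = 3, 4$, matching the condition $a \le 2$.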

Example: $\Delta = -28$ with conductor 2 again. Here the only two forms up to
equivalence are $Q_1 = x^2 + 7y^2$ and $Q_2 = 2x^2 + 2xy + 4y^2$ which is not primitive.

[Figure: topograph of $Q_1(x, y) = x^2 + 7y^2$]

The primes represented in discriminant $-28$ are 2, 7, and odd primes $p$ with
$\left(\frac{-28}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{7}{p}\right) = \left(\frac{p}{7}\right) = +1$
so $p \equiv 1, 2, 4$ mod 7. According to Theorem 6.11 the numbers represented by $Q_1$ or
$Q_2$ are the numbers $n = 2^a 7^b p_1 \cdots p_k$ with $b \le 1$ and each $p_i$ an odd
prime congruent to 1, 2, or 4 mod 7. There is no restriction on $a$ since when we apply
the theorem with $p_i = 2$ we have $s = 2$ and $q = -7 = 8l + 1$.

We can say exactly which numbers are represented by $Q_2$ since it is twice the
form $x^2 + xy + 2y^2$ of discriminant $-7$, which is a fundamental discriminant of
class number 1 so Theorem 6.8 tells us which numbers this form represents. These
are the numbers $7^b p_1 \cdots p_k$ with $b \le 1$ and primes $p_i \equiv 1, 2, 4$ mod 7,
including now the possibility $p_i = 2$. Thus $Q_2$ represents exactly the numbers
$2^a 7^b p_1 \cdots p_k$ with $a \ge 1$, $b \le 1$ and odd primes $p_i \equiv 1, 2, 4$ mod 7.
Hence $Q_1$ must represent at least the numbers $2^a 7^b p_1 \cdots p_k$ with $a = 0$,
$b \le 1$, and odd primes $p_i \equiv 1, 2, 4$ mod 7. These numbers are all odd since
$a = 0$, but $Q_1$ also represents some even numbers since $x^2 + 7y^2$ is even whenever
both $x$ and $y$ are odd.

From the topograph we might conjecture that $Q_1$ represents exactly the numbers
$2^a 7^b p_1 \cdots p_k$ with $a \ne 1, 2$ and the same conditions on $b$ and the primes
$p_i$ as before. For example one can see that 8, 16, 32, 64, and 128 are represented. It
is not difficult to exclude $a = 1$ and $a = 2$ by considering the values of $x^2 + 7y^2$
mod 4 and mod 8. To see that $Q_1$ represents all the predicted numbers with $a \ge 3$ we
use the following result.


Proposition 6.13. For a prime $p$, if a product $p^k q$ with $k > 0$ is represented by a
primitive form of discriminant $\Delta$ then $p^{k+2} q$ is represented by a primitive form
of discriminant $p^2 \Delta$.


Applying this to the case at hand with $p = 2$, the form $x^2 + xy + 2y^2$ represents
all the products $2^a 7^b p_1 \cdots p_k$ as above with $a \ge 1$, so $x^2 + 7y^2$
represents all these products with $a \ge 3$.


Proof: Suppose we have a primitive form of discriminant $\Delta$ representing $p^k q$, so
the topograph of this form has a region labeled $p^k q$. If $k > 0$ then at least one of
the regions adjacent to this region must have a label not divisible by $p$, otherwise a
vertex in the boundary of this region would have all three adjacent labels divisible by
$p$ so the form would be $p$ times another form, making it nonprimitive. Thus the given
form is equivalent to a form $p^k q x^2 + bxy + cy^2$ with $c$ not divisible by $p$. The
form $p^{k+2} q x^2 + pbxy + cy^2$ has discriminant $p^2 \Delta$ and is primitive since its
coefficients are not all divisible by $p$, nor are they divisible by any other prime
since such a prime would have to divide $q$, $b$, and $c$ making the previous form
$p^k q x^2 + bxy + cy^2$ nonprimitive. □
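The construction in this proof is easy to try on the case $p = 2$, $\Delta = -7$. The
Python sketch below is an added illustration with hypothetical names. It starts from the
form $2x^2 + xy + y^2$, which is $x^2 + xy + 2y^2$ with the variables swapped, applies the
coefficient change $[a, b, c] \mapsto [p^2 a, pb, c]$, and then searches for primitive
pairs $(x, y)$ giving small powers of 2 as values of $x^2 + 7y^2$.

```python
from math import gcd


def discriminant(a, b, c):
    return b * b - 4 * a * c


def lift_form(a, b, c, p):
    """Given [a, b, c] with p dividing a but not c, return [p^2 a, p b, c]."""
    return (p * p * a, p * b, c)


def primitively_represents(form, n, bound=12):
    """Search |x|, |y| <= bound for a coprime pair (x, y) with form value n."""
    a, b, c = form
    return any(
        a * x * x + b * x * y + c * y * y == n and gcd(x, y) == 1
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
    )


start = (2, 1, 1)                      # discriminant -7, represents 2 at (1, 0)
lifted = lift_form(*start, 2)          # [8, 2, 1] = (x + y)^2 + 7x^2
print(discriminant(*start), discriminant(*lifted), lifted)
print([n for n in (2, 4, 8, 16, 32, 64, 128) if primitively_represents((1, 0, 7), n)])
```

The lifted form has discriminant $-28 = 2^2 \cdot (-7)$, and the search finds 8, 16, 32,
64, and 128 but neither 2 nor 4 among the values of $x^2 + 7y^2$ at primitive pairs,
consistent with the discussion of $Q_1$ above.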


For nonfundamental discriminants Theorem 6.11 says nothing about whether the
representing forms are primitive. As we will see in Theorem 7.7, determining the
numbers represented by primitive forms of a given discriminant also reduces to the
special case of representing prime powers by primitive forms. Namely, a product of
powers $p_i^{e_i}$ of distinct primes $p_i$ is represented by a primitive form exactly when
each of the prime powers $p_i^{e_i}$ is represented by a primitive form. Most prime powers
are represented only by primitive forms, according to the following easy result:

Proposition 6.14. A form of discriminant $\Delta$ representing a power $p^k$ of a prime
$p$ not dividing the conductor of $\Delta$ is primitive.


Proof: If a form $Q$ representing $p^k$ is not primitive it is a multiple of another form
by some integer $d > 1$. This number $d$ divides every number represented by $Q$ so in
particular $d$ divides $p^k$ and hence $p$ divides $d$. The discriminant of $Q$ is $d^2$
times the discriminant of the other form, so $d$ divides the conductor, and this means
that $p$ divides the conductor. Thus if $p$ does not divide the conductor then $Q$ must
be primitive. □


For primes dividing the conductor one can get some idea of the complications
that can occur from the table on the next page. This lists all the equivalence classes of
forms, both primitive and nonprimitive, for nonfundamental negative discriminants
up to $-99$, along with the prime powers $p^k$ represented by these forms for primes
$p$ dividing the conductor $d$. To save space the table uses the abbreviated notation
$[a, b, c]$ for the form $ax^2 + bxy + cy^2$.

Some information in the table can be deduced from the earlier Proposition 6.13,
such as the fact that if nonprimitive forms of a given discriminant represent all powers
$p^k$ with $k \ge 1$ then primitive forms of that discriminant represent all powers $p^k$
with $k \ge 3$. This statement is optimal for some discriminants such as $-28$ and $-60$
but not for others such as $-72$ and $-99$ where $p^2$ is also represented by a primitive
form.

In the table one can see that primitive forms represent powers of primes dividing
the conductor but not these primes themselves. As we will show in Proposition 6.15,
a prime can only be represented by a single equivalence class of forms of a given
discriminant, and a prime $p$ dividing the conductor for discriminant $\Delta$ is
represented by $p$ times the principal form of discriminant $\Delta/p^2$, so $p$ is
represented by a nonprimitive form and hence cannot also be represented by a primitive
form. The uniqueness of forms representing primes holds also for powers of primes that
do not divide the conductor, but we see from the table that this uniqueness may not hold
for primes that do divide the conductor, even if we restrict attention just to primitive
forms, as for example in the case $\Delta = -32$ where $2^3$ is represented by two
nonequivalent primitive forms, or discriminants $-72$ and $-99$ where there are
infinitely many different powers $p^k$ represented by different primitive forms.

The entries in the table where Theorem 6.11 says that only finitely many powers
$p^k$ are represented can be checked just by drawing topographs, but in the other cases
one must use general theory. We already explained the first case $\Delta = -28$ in the
earlier analysis of the form $x^2 + 7y^2$. For the next case $\Delta = -60$ the methods in
the next section will suffice. A technique for handling the last few cases in the table
will be explained at the end of Chapter 8.

In the previous section we saw examples where two nonequivalent forms of the
same discriminant both represent the same number. However, this does not happen
for representations of 1 or primes or powers of most primes:


Proposition 6.15. If $Q_1$ and $Q_2$ are two forms of the same discriminant that both
represent the same prime $p$ or both represent 1, then $Q_1$ and $Q_2$ are equivalent.
The same conclusion holds when $Q_1$ and $Q_2$ both represent the same power $p^k$ of
an odd prime $p$ that does not divide the discriminant.


The last statement is also true for $p = 2$ but the proof is more difficult so we will
wait until the next chapter to deduce this from a more general result, Theorem 7.7.
Examples showing that powers of primes dividing the discriminant can be represented
by nonequivalent forms of the same discriminant can be found in the table on the
previous page. In these examples the prime in question divides the conductor, not
just the discriminant, but this has to be the case since for primes $p$ dividing the
discriminant but not the conductor the only power $p^k$ represented by a form of the
given discriminant is $p$ itself, by Proposition 6.7.


Proof: Suppose that $Q$ is a form representing a number $p$ that is either 1 or a prime.
The topograph of $Q$ then has a region labeled $p$, and we have seen that the $h$-labels
on the edges adjacent to this $p$-region form an arithmetic progression with increment
$2p$ when these edges are all oriented in the same direction. We have the discriminant
formula $\Delta = h^2 - 4pq$ where $h$ is the label on one of these edges and $q$ is the
value of $Q$ for the region on the other side of this edge. Since $p$ is nonzero the
equation $\Delta = h^2 - 4pq$ determines $q$ in terms of $\Delta$ and $h$. This implies
that $\Delta$ and the arithmetic progression determine the form $Q$ up to equivalence
since the progression determines $p$, and any $h$-value in the progression then
determines the $q$-value corresponding to this $h$-value, so $Q$ is equivalent to
$px^2 + hxy + qy^2$.

In the case that $p = 1$ the increment in the arithmetic progressions is 2 so the
two possible progressions of $h$-values adjacent to the $p$-region are the even numbers
and the odd numbers. We know that $h$ has the same parity as $\Delta$, so $\Delta$
determines which of the two progressions we have. As we saw in the preceding paragraph,
this implies that the form is determined by $\Delta$, up to equivalence.

Now we consider the case that $p$ is prime. Let $Q_1$ and $Q_2$ be two forms of the
same discriminant $\Delta$ both representing $p$. For $Q_1$ choose an edge in its
topograph adjacent to the $p$-region, with $h$-label $h_1$ and $q$-label $q_1$. For the
form $Q_2$ we similarly choose an edge with associated labels $h_2$ and $q_2$. Both $h_1$
and $h_2$ have the same parity as $\Delta$. We have
$\Delta = h_1^2 - 4pq_1 = h_2^2 - 4pq_2$ and hence $h_1^2 \equiv h_2^2$ mod $4p$. This
implies $h_1^2 \equiv h_2^2$ mod $p$, so $p$ divides
$h_1^2 - h_2^2 = (h_1 + h_2)(h_1 - h_2)$. Since $p$ is prime, it must divide one of the
two factors and hence we must have $h_1 \equiv \pm h_2$ mod $p$. By changing the
orientations of the edges in the topograph for $Q_1$ or $Q_2$ if necessary, we can assume
that $h_1 \equiv h_2$ mod $p$.

If $p$ is odd we can improve this congruence to $h_1 \equiv h_2$ mod $2p$ since we know
that $h_1 - h_2$ is divisible by both $p$ and 2 (since $h_1$ and $h_2$ have the same
parity), hence $h_1 - h_2$ is divisible by $2p$. The congruence $h_1 \equiv h_2$ mod $2p$
implies that the arithmetic progression of $h$-values adjacent to the $p$-region for
$Q_1$ is the same as for $Q_2$ since $2p$ is the increment for both progressions. By what
we showed earlier, this implies that $Q_1$ and $Q_2$ are equivalent.

When $p = 2$ this argument needs to be modified slightly. We still have
$h_1^2 \equiv h_2^2$ mod $4p$ so when $p = 2$ this becomes $h_1^2 \equiv h_2^2$ mod 8. Since
$2p = 4$ the four possible arithmetic progressions of $h$-values are $h \equiv 0$, 1, 2, or
3 mod 4. We can interchange the possibilities 1 and 3 just by reorienting the edges,
leaving only the possibilities $h \equiv 0$, 1, or 2 mod 4. These are distinguished from
each other by the congruence $h_1^2 \equiv h_2^2$ mod 8 since $(4k)^2 \equiv 0$ mod 8,
$(4k + 1)^2 \equiv 1$ mod 8, and $(4k + 2)^2 \equiv 4$ mod 8.

Finally we have the case that $Q_1$ and $Q_2$ both represent the power $p^k$ of an odd
prime $p$ not dividing $\Delta$, with $k > 1$. Following the line of proof above we see
that $p^k$ divides $h_1^2 - h_2^2 = (h_1 + h_2)(h_1 - h_2)$. If $p^k$ divides either factor
we can proceed exactly as before to show that $Q_1$ and $Q_2$ are equivalent since we
assume $p$ is odd, hence also $p^k$. If $p^k$ does not divide either factor then both
factors are divisible by $p$, hence $p$ divides their sum $2h_1$. Since $p$ is odd this
implies that $p$ divides $h_1$, and so $p$ divides $\Delta = h_1^2 - 4p^k q_1$. Thus if $p$
does not divide $\Delta$ then the case that $p^k$ divides neither $h_1 + h_2$ nor
$h_1 - h_2$ does not arise. □
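Proposition 6.15 implies that two nonequivalent forms of the same discriminant never
share a prime value. The Python sketch below is an added illustration that checks this
by direct search for discriminant $-20$, using the reduced forms $x^2 + 5y^2$ and
$2x^2 + 2xy + 3y^2$ as representatives of its two classes. Both the choice of example and
the helper names are assumptions made for the illustration.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def prime_values(a, b, c, limit):
    """Primes <= limit among the values of a*x^2 + b*x*y + c*y^2 over a search box."""
    bound = isqrt(limit) + 1           # sufficient for the two forms used below
    found = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            v = a * x * x + b * x * y + c * y * y
            if 0 < v <= limit and is_prime(v):
                found.add(v)
    return found


LIMIT = 400
first = prime_values(1, 0, 5, LIMIT)    # x^2 + 5y^2,        discriminant -20
second = prime_values(2, 2, 3, LIMIT)   # 2x^2 + 2xy + 3y^2, discriminant -20
print(sorted(first)[:6], sorted(second)[:6], first & second)
```

The last printed item is the set of primes found for both forms, which the proposition
says must be empty.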


The same argument shows another interesting fact: 


Proposition 6.16. If the topograph of a form has two regions with the same label
$n$ where $n$ is either 1, a prime, or a power of an odd prime not dividing the
discriminant, then there is a symmetry of the topograph that takes one region labeled
$n$ to the other. Similarly, for positive discriminants and for the same numbers $n$,
if there is one region labeled $n$ and another labeled $-n$ then there is a skew
symmetry taking one region to the other.


Proof: Suppose first that there are two regions having the same label n. As we saw 
in the proof of the preceding proposition, each of these regions is adjacent to an edge 
with the same label h and hence the labels q across these edges are also the same. 
This means there is a symmetry taking one region labeled n to the other. 

The other case is that one region is labeled $n$ and the other $-n$. The topographs
of the given form $Q$ and its negative $-Q$ then each have a region labeled $n$ so there
is an equivalence from $Q$ to $-Q$ taking the $n$-region for $Q$ to the $n$-region for
$-Q$. This equivalence can be regarded as a skew symmetry of $Q$ taking the $n$-region to
the $-n$-region. □

We can be more specific about forms that represent prime divisors of the
discriminant, and more generally divisors of the discriminant that are products of
distinct primes:


Proposition 6.17. Let $a$ be a positive squarefree number dividing the discriminant
$\Delta$. Then $a$ is represented by a unique equivalence class of forms of discriminant
$\Delta$, namely by a form $ax^2 + cy^2$ or $ax^2 + axy + cy^2$. Moreover $a$ appears
in the topographs of these forms only on a reflector line of a mirror symmetry.


When we studied symmetries of topographs in Section 5.4 we saw that a form
with mirror symmetry and with a label $a$ on the reflector line is always equivalent to a
form $ax^2 + cy^2$ or $ax^2 + axy + cy^2$, with $a$ dividing the discriminant in both
cases. However $a$ need not be squarefree, as one can see in the case $\Delta = -36$ where
there are three equivalence classes of forms:


[Figure: topographs of the three forms $x^2 + 9y^2$, $2x^2 + 2xy + 5y^2$, and $3x^2 + 3y^2$]


The first two topographs have a single reflector line while the third has two reflector 
lines. The squarefree positive divisors of the discriminant are 1, 2,3,6 and these each 
appear in a unique topograph, always on a reflector line. The non-squarefree divisors 
9 and 18 appear in two topographs, once on a reflector line and once not on a reflector 
line in each case. The remaining non-squarefree divisors 4,12,36 do not appear in 
any of the topographs. 
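
The claims in this example can be checked by brute force. The following Python sketch is
only an illustration: the function names and the search bound are our own choices, and
the bound is adequate here because the three forms are positive definite and we only
look for values up to 36.

```python
from math import gcd

# The three forms of discriminant -36 from the text, as (a, b, c) for ax^2 + bxy + cy^2.
FORMS = {
    "x^2 + 9y^2": (1, 0, 9),
    "2x^2 + 2xy + 5y^2": (2, 2, 5),
    "3x^2 + 3y^2": (3, 0, 3),
}

def topograph_values(a, b, c, bound=12):
    """Values of the form at primitive pairs (x, y) with |x|, |y| <= bound."""
    values = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) == 1:  # primitive pairs label the regions of the topograph
                values.add(a * x * x + b * x * y + c * y * y)
    return values

def forms_representing(n):
    """Names of the forms whose topograph contains the number n."""
    return [name for name, (a, b, c) in FORMS.items()
            if n in topograph_values(a, b, c)]

if __name__ == "__main__":
    for d in (1, 2, 3, 6, 9, 18, 4, 12, 36):
        print(d, forms_representing(d))
```

Each squarefree divisor 1, 2, 3, 6 is found in exactly one list, the divisors 9 and 18 in
two, and 4, 12, 36 in none.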

Examples like this can only occur when Δ is not a fundamental discriminant
since divisors of a fundamental discriminant can only be represented when they are
squarefree, as we saw in Theorem 6.8.


Proof of Proposition 6.17: We know from Theorem 6.11 that each squarefree divisor
a of Δ is represented by some form Q of discriminant Δ, so a appears in the
topograph of Q. (This can also be deduced just from Lemma 6.4 and Proposition 6.7.)
If b is one of the labels on an edge bordering the region labeled a then we have
Δ = b² − 4ac for c the label on the other region adjacent to the b edge. Since we
assume a divides Δ = b² − 4ac it must also divide b², and if a is squarefree it will
therefore divide b. Thus we have b = ma for some integer m. The labels on the
edges bordering the a region form an arithmetic progression with increment 2a so
these are the numbers b + 2ka as k ranges over all integers. Since b = ma we can
factor b + 2ka as (m + 2k)a. The numbers m + 2k for varying k form an arithmetic
progression consisting of all even numbers if m is even and all odd numbers
if m is odd. Thus we can choose k so that m + 2k is either 0 or 1, and hence the
arithmetic progression (m + 2k)a contains either 0 or a. This means one of the
edge labels on the border of the a region is either 0 or a, with the topograph near this
edge having the shape shown in one of the figures at the right. From this we see that
there is a reflector line passing through the a region and the form is equivalent to either
ax² + cy² or ax² + axy + cy².

To finish the proof we only need to see that there cannot be two forms ax² + cy²
and ax² + axy + c′y² with the same a and the same discriminant. Equating the
discriminants of these two forms, we would have −4ac = a² − 4ac′ and therefore
a = 4(c′ − c), but a would then be divisible by 4 and thus not squarefree. □


For the last result in this section we will use a variant of Euclid’s proof that there 
are infinitely many primes to prove the following general statement: 


Proposition 6.18. For each discriminant Δ the set of primes represented in
discriminant Δ is infinite.


Proof: In each discriminant Δ there is a form Q(x, y) = x² + bxy + cy² representing
1. We can assume c is nonzero since in the topograph of Q there will always be
at least one region adjacent to the 1 region that is not labeled by 0. (Only parabolic
and 0-hyperbolic forms can have a 0 region and they have at most two 0 regions.) Let
p₁, ···, p_k be any finite list of primes. We allow repetitions on this list so we can make
k as large as we like just by repeating some p_i often enough. Let P be the product
p₁ ··· p_k and consider the number n = Q(1, P) = 1 + bP + cP². This is represented by
Q since (1, P) is a primitive pair. If k is large enough we will have |n| > 1 since |cP²|
will be much larger than |1 + bP|. Any prime p dividing n will also be represented
by some form of discriminant Δ. This p must be different from any of the primes
p_i on the initial list since dividing p_i into n = 1 + bP + cP² gives a remainder of 1,
whereas p divides n evenly. Thus we have shown that for any finite list of primes
there is another prime not on the list that is represented in discriminant Δ. Hence
the set of primes represented in discriminant Δ must be infinite. □
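
The construction in this proof is easy to carry out by machine. The Python sketch below
is a hedged illustration only: the function names are ours, trial division is used purely
for simplicity, and the list of primes is repeated until |n| > 1 as the proof allows.

```python
def smallest_prime_factor(n):
    """Smallest prime factor of |n|, assuming |n| > 1, by trial division."""
    n = abs(n)
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

def new_represented_prime(b, c, primes):
    """Return a prime dividing n = Q(1, P) = 1 + bP + cP^2 for Q = x^2 + bxy + cy^2,
    where P is a product of the listed primes (with repetitions if needed)."""
    assert c != 0 and len(primes) > 0, "the proof assumes c nonzero and a nonempty list"
    P = 1
    for p in primes:
        P *= p
    n = 1 + b * P + c * P * P
    while abs(n) <= 1:  # repeat the list to enlarge P, as in the proof
        for p in primes:
            P *= p
        n = 1 + b * P + c * P * P
    return smallest_prime_factor(n)

if __name__ == "__main__":
    # Q = x^2 + xy - 4y^2 of discriminant 17 and the list 2, 3: n = 1 + 6 - 144 = -137.
    print(new_represented_prime(1, -4, [2, 3]))
```

The returned prime always leaves remainder 1 when divided into by any prime on the list,
so it cannot be one of them.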


Exercises 


1. Determine discriminants Δ for which there exists a quadratic form of discriminant
Δ that represents 5, and also the discriminants for which there does not exist a form
representing 5. When 5 is represented, find a form that gives the representation.


2. The following is a generalization of Lemma 6.4. Let P(x) be a polynomial with
integer coefficients and let n be an integer. Show that if the congruence P(x) ≡ n
has a solution mod m₁ and also a solution mod m₂ where m₁ and m₂ are coprime,
then it has a solution mod m₁m₂. Give an example where this fails without the
coprimeness condition.


3. Verify that the statement of quadratic reciprocity is true for the following pairs of 
primes (p,q): (3,5), (3,7), (3,13), (5,13), (7,11), and (13,17). 


4. Evaluate the following Legendre symbols: (4), =), (are). 


5. Show that (4) can always be computed just from the four basic properties of 


Legendre symbols. 
6. Determine which numbers in the range from 40 to 50 are squares mod 132. 


7. (a) Using quadratic reciprocity determine which primes are represented by some
form of discriminant 17.

(b) Show that all forms of discriminant 17 are equivalent to the form x² + xy − 4y².

(c) Draw enough of the topograph of x² + xy − 4y² to show all values between −70
and 70, and verify that the primes that occur are precisely the ones predicted by your
answer in part (a).


8. Determine which primes are represented by at least one form of the following
discriminants: (a) 21 (b) −19 (c) −20 (d) −24.


9. Show that every prime is represented by at least one of the forms x² + y², x² + 2y²,
and x² − 2y².


10. Consider forms Q = ax² + bxy + cy² of discriminant Δ. Show that the following
three conditions are equivalent:

(1) The coefficients a, b, and c of Q are all odd.

(2) Q represents only odd numbers.

(3) Δ ≡ 5 mod 8.


11. For which fundamental discriminants Δ is there a form of discriminant Δ
representing |Δ|? What about nonfundamental discriminants?


12. In terms of their prime factorizations, which numbers are sums of two nonzero
squares? Which squares are sums of two nonzero squares?


13. Show that if the form x² + ny² represents 2^k with n odd and k > 0 then n ≡ 7
mod 8 except when (n, k) = (1, 1) and (3, 2).


14. Show that for each prime p dividing the conductor for discriminant Δ there is at
least one primitive form of discriminant Δ that represents a power of p. Hint: Use
induction on the highest power of p dividing the conductor, along with Theorem 6.11
and Propositions 6.13 and 6.14.


6.3 Genus and Characters


In the previous section we obtained a reasonably complete answer to the ques- 
tion of which numbers are represented in a given discriminant. One determines which 
primes are represented using Legendre symbols, and in a fairly simple way this de- 
termines which nonprimes are represented. For discriminants of class number 1 this 
gives a complete answer to the question of which forms represent which numbers. 

The main goal of the present section is to see how Legendre symbols, along with a 
few other things like them, can give additional information when the class number is 
not 1. In particular, in favorable cases we will be able to determine fully which forms 
represent which primes. Underlying this method is the following basic result: 


Proposition 6.19. Let Q be a form of discriminant Δ and let p be an odd prime
dividing Δ. Then the Legendre symbol (n/p) has the same value for all numbers n
in the topograph of Q that are not divisible by p.


Before proving this let us see how it applies in the case Δ = 40 with p = 5. The
class number here is 2 corresponding to the forms Q₁ = x² − 10y² and Q₂ = 2x² − 5y².

According to the proposition, for each of the two forms the value of (n/5) must be the
same for all numbers n in the topograph not divisible by 5. To determine the value
of (n/5) for each form it therefore suffices to compute it for a single number n. The
simplest thing is just to compute it for (x, y) = (1, 0) or (0, 1). Choosing (1, 0), for
x² − 10y² we have (1/5) = +1 and for 2x² − 5y² we have (2/5) = −1. The proposition
then says that all numbers n in the topograph of x² − 10y² not divisible by 5 have
(n/5) = +1, hence n ≡ ±1 mod 5, while for 2x² − 5y² we have (n/5) = −1, hence
n ≡ ±2 mod 5. Thus the last digits of the numbers in the topograph of x² − 10y²
must be 0, 1, 4, 5, 6, or 9 and for 2x² − 5y² the last digits must be 0, 2, 3, 5, 7, or 8.
Note that the congruences n ≡ ±1 and n ≡ ±2 mod 5 are consistent with the fact
that for both forms the negative values are just the negatives of the positive values.
(The proposition holds for negative as well as positive numbers in topographs.)

We know that (40/p) = (2/p)(5/p) must equal +1 for primes p ≠ 2, 5 represented by
either form, so for x² − 10y² this product must be (+1)(+1) while for 2x² − 5y² it
must be (−1)(−1).

p mod 40 |   1   3   7   9  11  13  17  19  21  23  27  29  31  33  37  39
---------+-----------------------------------------------------------------
(2/p)    |  +1  -1  +1  +1  -1  -1  +1  -1  -1  +1  -1  -1  +1  +1  -1  +1
(5/p)    |  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1  +1
form     |  Q₁  Q₂      Q₁      Q₂                  Q₂      Q₁      Q₂  Q₁

From the table we can see exactly which primes each of these two forms represents,
namely x² − 10y² represents primes p ≡ 1, 9, 31, 39 mod 40 while 2x² − 5y²
represents primes p ≡ 3, 13, 27, 37 mod 40.
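
As a numerical sanity check of Proposition 6.19 for these two forms, one can compute
the symbol (n/5) over a finite part of each topograph. The sketch below is illustrative
only: the function names and the search bound are our own, and the Legendre symbol is
computed with Euler's criterion rather than by any method from the text.

```python
from math import gcd

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0, 1, or p - 1

def symbol_values(a, b, c, p, bound=30):
    """Set of values (n/p) over numbers n = ax^2 + bxy + cy^2 at primitive pairs
    (x, y) with |x|, |y| <= bound, skipping the n divisible by p."""
    found = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) == 1:
                n = a * x * x + b * x * y + c * y * y
                if n % p != 0:
                    found.add(legendre(n, p))
    return found

if __name__ == "__main__":
    print(symbol_values(1, 0, -10, 5))  # x^2 - 10y^2
    print(symbol_values(2, 0, -5, 5))   # 2x^2 - 5y^2
```

Each call returns a one-element set, the constant value of (n/5) on that topograph.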


Proof of Proposition 6.19: For an edge in the topograph labeled h with adjacent
regions labeled n and k we have Δ = h² − 4nk. If p is a prime dividing Δ this
implies that 4nk ≡ h² mod p. Thus if neither n nor k is divisible by p and p is
odd then the Legendre symbol (4nk/p) is defined and (4nk/p) = (h²/p) = +1. Since
(4nk/p) = (4/p)(n/p)(k/p) and (4/p) = +1 this implies (n/p) = (k/p). In other words, the
symbol (·/p) takes the same value on any two adjacent regions of the topograph of Q
labeled by numbers not divisible by p. To finish the proof we will use the following
fact:


Lemma 6.20. Given a form Q and a prime p dividing the discriminant of Q, then 
any two regions in the topograph of Q where the value of Q is not divisible by p 
can be connected by a path passing only through such regions. 


Assuming this, Proposition 6.19 easily follows since we have seen that the value
of (n/p) is the same for any two adjacent regions with label not divisible by p. □


Proof of the Lemma: Let us call regions in the topograph of Q whose label is not
divisible by p good regions, and the other regions bad regions. We can assume that
at least one region is good, otherwise there is nothing to prove. What we will show
is that no two bad regions can be adjacent. Thus a path in the topograph from one
good region to another cannot pass through two consecutive bad regions, and if it
does pass through a bad region then a detour around this region allows this bad
region to be avoided, creating a new path passing through one fewer bad region as
in the figure at the right. By repeating this detouring process as often as necessary
we eventually obtain a path avoiding bad regions entirely, still starting and ending
at the same two given good regions.

To see that no two adjacent regions are bad, suppose this is false, so there are
two adjacent regions whose Q values n and k are both divisible by p. If the edge
separating these two regions is labeled h then we have an equation Δ = h² − 4nk, and
since we assume p divides Δ this implies that p divides h as well as n and k. Thus
the form nx² + hxy + ky², which is equivalent to Q, is equal to p times another
form. This implies that all regions in the topograph of Q are bad. This contradicts
an earlier assumption so we conclude that there are no adjacent bad regions. □


A useful observation is that the value of (n/p) for numbers n in the topograph of
a form ax² + bxy + cy² with discriminant divisible by p can always be determined
just by looking at the coefficients a and c. This is because a and c appear in adjacent
regions of the topograph, so if both these coefficients were divisible by p, this would
imply that b was also divisible by p since p divides b² − 4ac, so the whole form
would be divisible by p. Excluding this uninteresting possibility, we see that at least
one of a and c is not divisible by p and we can use this to compute (n/p).


Let us look at another example, the discriminant Δ = −84 = −2²·3·7 with three
different prime factors. For this discriminant there are four equivalence classes of
forms: Q₁ = x² + 21y², Q₂ = 3x² + 7y², Q₃ = 2x² + 2xy + 11y², and Q₄ =
5x² + 4xy + 5y². The topographs of these forms were shown earlier in the chapter.
To see which odd primes are represented in discriminant −84 we compute:

(−84/p) = (−1/p)(3/p)(7/p) = (−1/p)(−1/p)(p/3)(−1/p)(p/7) = (−1/p)(p/3)(p/7)

As in the example of Δ = 40 we can make a table of the values of these Legendre
symbols for the 24 numbers mod 84 that are not divisible by the prime divisors
2, 3, 7 of 84. Using the fact that the squares mod 3 are (±1)² = 1 and the squares
mod 7 are (±1)² = 1, (±2)² = 4, and (±3)² = 2, we obtain the following table:

p mod 84 |   1   5  11  13  17  19  23  25  29  31  37  41
---------+-------------------------------------------------
(−1/p)   |  +1  +1  -1  +1  +1  -1  -1  +1  +1  -1  +1  +1
(p/3)    |  +1  -1  -1  +1  -1  +1  -1  +1  -1  +1  +1  -1
(p/7)    |  +1  -1  +1  -1  -1  -1  +1  +1  +1  -1  +1  -1
form     |  Q₁  Q₄  Q₃      Q₄  Q₂  Q₃  Q₁      Q₂  Q₁  Q₄

p mod 84 |  43  47  53  55  59  61  65  67  71  73  79  83
---------+-------------------------------------------------
(−1/p)   |  -1  -1  +1  -1  -1  +1  +1  -1  -1  +1  -1  -1
(p/3)    |  +1  -1  -1  +1  -1  +1  -1  +1  -1  +1  +1  -1
(p/7)    |  +1  -1  +1  -1  -1  -1  +1  +1  +1  -1  +1  -1
form     |              Q₂                  Q₃

The twelve cases when the product (−1/p)(p/3)(p/7) is +1 give the congruence classes
of primes not dividing Δ that are represented by one of the four forms, and we can
determine which form it is by looking at the values of (p/3) and (p/7) for each of the four
forms. As noted earlier, these values can be computed directly from the coefficients
of x² and y² that are not divisible by 3 for (p/3) or by 7 for (p/7). For example, for
Q₂ = 3x² + 7y² the coefficient of y² tells us that (p/3) = (7/3) = +1 and the coefficient
of x² tells us that (p/7) = (3/7) = −1. Thus the pair (p/3), (p/7) is +1, −1 for Q₂. In a
similar way we find that (p/3), (p/7) is +1, +1 for Q₁ = x² + 21y², while it is −1, +1
for Q₃ = 2x² + 2xy + 11y² and −1, −1 for Q₄ = 5x² + 4xy + 5y². This allows us
to determine which congruence classes of primes are represented by which form, as
indicated in the table, since the product (−1/p)(p/3)(p/7) must be +1.
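
The bookkeeping for discriminant −84 can be mechanized. The following Python sketch
rebuilds the assignment of congruence classes to forms from the three symbols; the
names `chi4`, `classify`, and `FORM_BY_PAIR` are our own labels, and the pairs of values
attached to Q1, ..., Q4 are the ones computed from the coefficients in the text.

```python
from math import gcd

def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def chi4(n):
    """The symbol (-1/p) extended to odd n: +1 on 4k + 1 and -1 on 4k + 3."""
    return 1 if n % 4 == 1 else -1

# Values of ((n/3), (n/7)) on the topograph of each form of discriminant -84.
FORM_BY_PAIR = {(1, 1): "Q1", (1, -1): "Q2", (-1, 1): "Q3", (-1, -1): "Q4"}

def classify(n):
    """Form representing the primes congruent to n mod 84, or None if there is none."""
    c4, c3, c7 = chi4(n), legendre(n, 3), legendre(n, 7)
    if c4 * c3 * c7 != 1:  # (-84/p) = -1, so primes in this class are not represented
        return None
    return FORM_BY_PAIR[(c3, c7)]

classes = [n for n in range(1, 84) if gcd(n, 84) == 1]
table = {n: classify(n) for n in classes}

if __name__ == "__main__":
    for n in classes:
        print(n, table[n])
```

For instance 37 = 4² + 21·1² lies in a class assigned to Q1 and 19 = 3·2² + 7·1² in a
class assigned to Q2.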


Another case we looked at was Δ = −56 where there were three inequivalent
forms Q₁ = x² + 14y², Q₂ = 2x² + 7y², and Q₃ = 3x² + 2xy + 5y². Here we have
(−56/p) = (−1/p)(2/p)(7/p) = (2/p)(p/7). The table of values for these Legendre symbols for
congruence classes of numbers mod 56 not divisible by 2 or 7 is:

p mod 56 |      1      3      5      9     11     13     15     17     19     23     25     27
---------+-------------------------------------------------------------------------------------
(2/p)    |     +1     -1     -1     +1     -1     -1     +1     +1     -1     +1     +1     -1
(p/7)    |     +1     -1     -1     +1     +1     -1     +1     -1     -1     +1     +1     -1
form     |  Q₁,Q₂     Q₃     Q₃  Q₁,Q₂            Q₃  Q₁,Q₂            Q₃  Q₁,Q₂  Q₁,Q₂     Q₃

p mod 56 |     29     31     33     37     39     41     43     45     47     51     53     55
---------+-------------------------------------------------------------------------------------
(2/p)    |     -1     +1     +1     -1     +1     +1     -1     -1     +1     -1     -1     +1
(p/7)    |     +1     -1     -1     +1     +1     -1     +1     -1     -1     +1     +1     -1
form     |                              Q₁,Q₂                   Q₃

From the table we see that (2/p)(p/7) is (+1)(+1) for p ≡ 1, 9, 15, 23, 25, 39 mod 56 and
(−1)(−1) for p ≡ 3, 5, 13, 19, 27, 45 mod 56. Thus the primes that are represented
in discriminant −56 are the primes in these twelve congruence classes, along with 2
and 7, the prime divisors of 56. Moreover, since (p/7) has the value +1 for numbers in
the topographs of Q₁ and Q₂ not divisible by 7, and the value −1 for numbers in the
topograph of Q₃ not divisible by 7, we can deduce that primes p ≡ 1, 9, 15, 23, 25, 39
mod 56 are represented by Q₁ or Q₂ while primes p ≡ 3, 5, 13, 19, 27, 45 mod 56
are represented by Q₃. However the values of the Legendre symbols in the table do
not allow us to distinguish between Q₁ and Q₂.
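
To see concretely that congruences mod 56 cannot separate Q₁ from Q₂, one can list small
primes represented by each form. The sketch below is a rough illustration under our own
assumptions: the function names are ours, and the search box |x|, |y| ≤ √limit is a
simple choice that happens to be large enough for these particular positive definite forms.

```python
from math import gcd, isqrt

def is_prime(n):
    """Primality test by trial division, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def represented_primes(a, b, c, limit):
    """Primes below limit taken by ax^2 + bxy + cy^2 at primitive pairs (x, y)."""
    bound = isqrt(limit) + 1
    primes = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            n = a * x * x + b * x * y + c * y * y
            if n < limit and gcd(x, y) == 1 and is_prime(n):
                primes.add(n)
    return primes

q1_primes = represented_primes(1, 0, 14, 300)  # Q1 = x^2 + 14y^2
q2_primes = represented_primes(2, 0, 7, 300)   # Q2 = 2x^2 + 7y^2

if __name__ == "__main__":
    print(sorted(q1_primes))
    print(sorted(q2_primes))
```

For example 239 = 15² + 14·1² is represented by Q₁ and 71 = 2·2² + 7·3² by Q₂, and both
primes are congruent to 15 mod 56.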


Each row in one of the tables above can be regarded as a function assigning a
number ±1 to each congruence class of numbers n coprime to the discriminant Δ.
Such a function is called a character and the table is called a character table. There
is one column in the table for each congruence class of numbers coprime to Δ so the
number of columns is φ(|Δ|) where φ is the Euler phi function from Section 2.3. For
each odd prime p dividing Δ there is a character given by the Legendre symbol (n/p).
There is sometimes also a character associated to the prime 2 in a somewhat less
transparent way. In the example Δ = −84 this is the character defined by the first
row of the table, which assigns the values +1 to numbers n = 4k + 1 and −1 to
numbers n = 4k + 3. We will denote this character by χ₄ to indicate that its values
χ₄(n) = ±1 depend only on the value of n mod 4. Thus χ₄(p) = (−1/p) when p is
an odd prime, but χ₄(n) is defined for all odd numbers n, not just primes. One can
check that an explicit formula for χ₄ is χ₄(n) = (−1)^((n−1)/2) although we will not be
needing this formula.

In the example with Δ = −56 the character corresponding to the prime 2 is given
by the row labeled (2/p). This character associates the value +1 to an odd number
n ≡ ±1 mod 8 and the value −1 when n ≡ ±3 mod 8. We will denote it by χ₈ since
its values χ₈(n) = ±1 depend only on n mod 8. We have χ₈(p) = (2/p) for all odd
primes p, but χ₈(n) is defined for all odd numbers n. There is again an explicit
formula χ₈(n) = (−1)^((n²−1)/8) that we will not use.

By analogy we can also introduce the notation χ_p for the earlier character defined
by χ_p(n) = (n/p) for p an odd prime and n not divisible by p.
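
The two explicit formulas are easy to confirm numerically. In the Python sketch below the
function names are our own, and the exponents are reduced mod 2 by hand so that negative
odd numbers are handled with integer arithmetic only.

```python
def legendre(n, p):
    """Legendre symbol (n/p) for an odd prime p, via Euler's criterion."""
    r = pow(n % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def chi4(n):
    """+1 for odd n = 1 mod 4 and -1 for odd n = 3 mod 4."""
    return 1 if n % 4 == 1 else -1

def chi8(n):
    """+1 for odd n = 1 or 7 mod 8 and -1 for odd n = 3 or 5 mod 8."""
    return 1 if n % 8 in (1, 7) else -1

def chi4_formula(n):
    """(-1)^((n-1)/2), using only the parity of the exponent."""
    return -1 if ((n - 1) // 2) % 2 else 1

def chi8_formula(n):
    """(-1)^((n^2-1)/8), using only the parity of the exponent."""
    return -1 if ((n * n - 1) // 8) % 2 else 1

if __name__ == "__main__":
    odd = range(-99, 100, 2)
    print(all(chi4(n) == chi4_formula(n) for n in odd))
    print(all(chi8(n) == chi8_formula(n) for n in odd))
```

On odd primes these functions agree with the Legendre symbols (−1/p) and (2/p).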


As another example illustrating the use of characters let us determine which powers
of 2 are represented by the two forms x² + 15y² and 3x² + 5y² of discriminant
−60. This is not a fundamental discriminant since it is 4 times the fundamental
discriminant −15, so the conductor is 2 which is why the question of determining the
forms representing powers of 2 is more subtle, as we saw in the previous section. In
both the discriminants −15 and −60 we have the characters χ₃ and χ₅ and we can
use either one of these for this application so we will use χ₃.

First consider discriminant −15 where the class number is 2 corresponding to the
two forms x² + xy + 4y² and 2x² + xy + 2y². The second form represents 2 which
does not divide the discriminant −15 so all powers of 2 are represented by one or the
other of these two forms. To determine which form it is for each power we use the
character χ₃. This has the value +1 on numbers not divisible by 3 in the topograph
of x² + xy + 4y² since 1 is one of these numbers and χ₃(1) = +1. Similarly χ₃ has
the value −1 for the other form 2x² + xy + 2y² since 2 appears in the topograph
of this form and χ₃(2) = −1. We have χ₃(2^k) = (−1)^k since χ₃(2^k) = (2^k/3) = (2/3)^k.
Hence x² + xy + 4y² represents only the even powers of 2 and 2x² + xy + 2y²
represents only the odd powers.

For discriminant −60 the class number is also 2, corresponding to the forms
x² + 15y² and 3x² + 5y². Obviously neither of these forms represents 2 or 4.
However by Proposition 6.13 each power 2^k with k ≥ 3 is represented by at least one
of the two forms since all powers 2^k with k ≥ 1 are represented by one of the forms
of discriminant −15. The value of χ₃ for x² + 15y² is +1 since this form represents
1 and χ₃(1) = +1, and the value of χ₃ for 3x² + 5y² is −1 since this form represents
5 and χ₃(5) = −1. From this it follows as before that x² + 15y² represents just the
even powers of 2 starting with 2⁴ and 3x² + 5y² represents just the odd powers
starting with 2³. This is the answer that was given in the large table in the preceding
section.
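
The conclusion for discriminant −60 can be confirmed by direct search for small exponents.
This Python sketch is only a check, not part of the argument; the function name
`represents` is ours and it handles only forms ax² + cy² with a, c > 0.

```python
from math import gcd, isqrt

def represents(a, c, n):
    """True if ax^2 + cy^2 = n for some primitive pair (x, y), assuming a, c > 0."""
    for y in range(isqrt(n // c) + 1):
        rest = n - c * y * y          # this must equal a * x^2
        if rest % a == 0:
            x = isqrt(rest // a)
            if x * x == rest // a and gcd(x, y) == 1:
                return True
    return False

# Exponents k from 1 to 10 for which 2^k is represented by each form.
even_form = [k for k in range(1, 11) if represents(1, 15, 2 ** k)]  # x^2 + 15y^2
odd_form = [k for k in range(1, 11) if represents(3, 5, 2 ** k)]    # 3x^2 + 5y^2

if __name__ == "__main__":
    print(even_form)
    print(odd_form)
```

The first list consists of the even exponents from 4 on and the second of the odd
exponents from 3 on, for example 2⁴ = 1² + 15·1² and 2³ = 3·1² + 5·1².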

Let us consider now how characters can be associated to the prime 2 in general.
Since characters arise from primes that divide the discriminant, this means we are
interested in even discriminants, and the characters we are looking for should assign
a value ±1 to each number not divisible by 2, that is, to each odd number. We would
like the analogue of Proposition 6.19 to hold, so characters for the prime 2 should
take the same value on all odd numbers in the topograph of a form of the given
discriminant. By Lemma 6.20 this just means that the characters should have the
same value for odd numbers in adjacent regions of the topographs.

Even discriminants are multiples of 4 so can be written as Δ = 4δ. For adjacent
regions in a topograph with labels n and k we have Δ = h² − 4nk where h is the
label on the edge between the two regions. Since Δ is even, so is h and we can write
h = 2l. The discriminant equation then becomes 4δ = 4l² − 4nk or just δ = l² − nk.

There will turn out to be six different cases. The first two are when δ is odd, which
means that Δ is divisible by 4 but not 8. In these two cases we consider congruences
mod 4, the highest power of 2 dividing Δ. Since δ is odd and both n and k are odd,
the equation δ = l² − nk implies that l must be even, so l² ≡ 0 mod 4 and we have
nk ≡ −δ mod 4. Multiplying both sides of this congruence by k, we get n ≡ −δk
mod 4 since k² ≡ 1 mod 4, k being odd. Multiplying the congruence n ≡ −δk by k
again gives the previous congruence nk ≡ −δ so the two congruences are equivalent.


Case 1: δ = 4m − 1. The congruence condition n ≡ −δk mod 4 is then n ≡ k mod 4.
Thus Lemma 6.20 implies that the character χ₄ assigning +1 to integers 4s + 1 and
−1 to integers 4s − 1 has the same value for all odd numbers in the topograph of
a form of discriminant Δ = 4(4m − 1). We might try reversing the values of χ₄,
assigning the value +1 to integers 4s − 1 and −1 to integers 4s + 1, but this just
gives the function −χ₄ which does not really give any new information that χ₄ does
not give. In practice χ₄ turns out to be more convenient to use than −χ₄ would be.

An example for the case δ = 4m − 1 is the discriminant Δ = −84 considered
earlier, where the first row of the character table gave the values for χ₄.


Case 2: δ = 4m + 1. The difference from the previous case is that the congruence
condition is now n ≡ −k mod 4. This means the mod 4 value of odd numbers in the
topograph is not constant, and so we do not get a character for the prime 2. As an
example, consider the form x² + 3y² with Δ = −12 and δ = −3.

Here there are odd numbers in the topograph congruent to both 1 and 3 mod 4.
The situation is not improved by considering odd numbers mod 8 instead of mod 4
since the topograph contains numbers congruent to each of 1, 3, 5, 7 mod 8. Trying
congruences modulo higher powers of 2 does not help either.

The absence of a character for the prime 2 when δ = 4m + 1 could perhaps
have been predicted from the calculation of (Δ/p). Since δ is odd we have Δ =
4δ = 4p₁ ··· p_r for odd primes p₁, ···, p_r and so (Δ/p) = (p₁/p) ··· (p_r/p). This equals
(p/p₁) ··· (p/p_r) since the number of p_i's congruent to 3 mod 4 is even when δ =
4m + 1. (If δ is negative, so that δ = −p₁ ··· p_r, there is an extra factor (−1/p), but the
number of p_i's congruent to 3 mod 4 is then odd and the resulting signs cancel.)
Thus the value of (Δ/p) depends only on the characters associated to the odd
prime factors of Δ.


There remain the cases that δ is even. The next two cases are when Δ is divisible
by 8 but not by 16. After that is the case that Δ is divisible by 16 but not by 32,
and finally the case that Δ is divisible by 32. In all these cases we will consider
congruences mod 8, so the equation δ = l² − nk becomes δ ≡ l² − nk mod 8. Since
δ is now even while n and k are still odd, this congruence implies l is odd, and so
l² ≡ 1 mod 8 and the congruence can be written as nk ≡ 1 − δ mod 8. Since k² ≡ 1
mod 8 when k is odd, we can multiply both sides of the congruence nk ≡ 1 − δ by k
to obtain the equivalent congruence n ≡ (1 − δ)k mod 8.


Case 3: δ ≡ 2 mod 8. The congruence is then n ≡ −k mod 8. It follows that in the
topograph of a form of discriminant Δ = 4(8m + 2) either the odd numbers must all
be congruent to ±1 mod 8 or they must all be congruent to ±3 mod 8. Thus the
character χ₈ which takes the value +1 on numbers 8s ± 1 and −1 on numbers 8s ± 3
has a constant value, either +1 or −1, for all odd numbers in the topograph.

An example for this case is Δ = 40. Here the two rows of the character table
computed earlier in this section gave the values for χ₈ and χ₅.


Case 4: δ ≡ 6 mod 8. Now the congruence n ≡ (1 − δ)k mod 8 becomes n ≡ −5k,
or equivalently n ≡ 3k mod 8. This implies that all odd numbers in the topograph
of a form of discriminant Δ = 4(8m + 6) must be congruent to 1 or 3 mod 8, or
they must all be congruent to 5 or 7 mod 8. The character associated to the prime
2 in this case has the value +1 on numbers 8s + 1 and 8s + 3, and the value −1 on
numbers 8s + 5 and 8s + 7. We have not encountered this character previously, so
let us give it the new name χ₈′. However, it is not entirely new since it is actually just
the product χ₄χ₈ as one can easily check by evaluating this product on 1, 3, 5, and 7.

A simple example is Δ = −8 with class number 1. Here we have (Δ/p) = (−8/p) =
(−1/p)(2/p) which equals +1 for p ≡ 1, 3 mod 8 and −1 for p ≡ 5, 7 mod 8 so this is
just the character χ₈′.

Another example is Δ = 24 where there are the two forms Q₁ = x² − 6y² and
Q₂ = 6x² − y². We have (Δ/p) = (6/p) = (2/p)(3/p) = (2/p)(−1/p)(p/3). The character table
has the following form:

p mod 24 |   1   5   7  11  13  17  19  23
---------+---------------------------------
χ₈′      |  +1  -1  -1  +1  -1  +1  +1  -1
χ₃       |  +1  -1  +1  -1  +1  -1  +1  -1

Thus Q₁ represents primes p ≡ 1, 19 mod 24 and Q₂ represents primes p ≡ 5, 23
mod 24.


Case 5: δ ≡ 4 mod 8. Now we have the congruence n ≡ −3k mod 8. Thus in
the topograph of a form of discriminant Δ = 4(8m + 4) all odd numbers must be
congruent to 1 or 5 mod 8, or they must all be congruent to 3 or 7 mod 8. More
simply, one can say that all odd numbers in the topograph must be congruent to 1
mod 4 or they must all be congruent to 3 mod 4. Thus we obtain the character χ₄
again.

An example is Δ = −48 where we have the two forms Q₁ = x² + 12y² and
Q₂ = 3x² + 4y² as well as a pair of nonprimitive forms Q₃ = 2x² + 6y² and Q₄ =
4x² + 4xy + 4y². We have (Δ/p) = (−3/p) = (−1/p)(3/p) = (p/3). This is the character χ₃.
We also have the character χ₄ that we just described. Here is the character table:

p mod 48 |   1   5   7  11  13  17  19  23  25  29  31  35  37  41  43  47
---------+-----------------------------------------------------------------
χ₄       |  +1  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1
χ₃       |  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1
form     |  Q₁      Q₂      Q₁      Q₂      Q₁      Q₂      Q₁      Q₂

The columns repeat every four columns since (−1/p) and (p/3) are determined by the
value of p mod 12. In contrast with earlier examples, the representability of a prime
p > 3 in discriminant −48 is determined by one character, χ₃, and the other character
χ₄ serves only to decide which of the forms Q₁ and Q₂ achieves the representation.
The character χ₄ says nothing about the nonprimitive forms Q₃ and Q₄ whose values
are all even. On the other hand, from χ₃ we can deduce that all values of Q₃ not
divisible by 3 must be congruent to 2 mod 3 while for Q₄ they must be congruent
to 1 mod 3. This could also have been deduced from applying χ₃ to the associated
primitive forms x² + 3y² and x² + xy + y².


Case 6: δ ≡ 0 mod 8, so Δ is a multiple of 32. In this case the congruence n ≡ (1 − δ)k
mod 8 becomes simply n ≡ k mod 8. Thus all odd numbers in the topograph of a
form of discriminant Δ = 32m must lie in the same congruence class mod 8. The two
characters χ₄ and χ₈ can now both occur independently, as shown in the following
chart listing their values on the four classes 1, 3, 5, 7 mod 8:

         |   1   3   5   7
---------+-----------------
χ₄       |  +1  -1  +1  -1
χ₈       |  +1  -1  -1  +1

As an example consider the discriminant Δ = −32. Here there are two primitive
forms Q₁ = x² + 8y² and Q₂ = 3x² + 2xy + 3y² along with one nonprimitive form
Q₃ = 2x² + 4y². We have (Δ/p) = (−2/p) = (−1/p)(2/p) with the two factors being the
two independent characters for the prime 2. The full character table is then just a
four-fold repetition of the previous shorter table:

p mod 32 |   1   3   5   7   9  11  13  15  17  19  21  23  25  27  29  31
---------+-----------------------------------------------------------------
χ₄       |  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1  +1  -1
χ₈       |  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1  +1  +1  -1  -1  +1
form     |  Q₁  Q₂          Q₁  Q₂          Q₁  Q₂          Q₁  Q₂


This finishes the analysis of the six cases for characters associated to the prime 2. 
To summarize we have: 


Proposition 6.21. The characters associated to the prime 2 are given in the following
table:


Δ  |  4(4m+1)   4(4m+3)   8(4m+1)   8(4m+3)       16(2m+1)   32m
χ  |  none      χ₄        χ₈        χ₈′ = χ₄χ₈    χ₄         χ₄, χ₈
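
The table can be turned into a small lookup, and its predictions compared with the odd
numbers actually occurring in a topograph. The Python sketch below is an illustration
under our own naming (`characters_for_two`, `odd_residues_mod8`), and the brute-force
search over a finite box of primitive pairs is only a spot check, not a proof.

```python
from math import gcd

def characters_for_two(disc):
    """Names of the characters for the prime 2 in discriminant disc, per the table."""
    if disc % 4 != 0:
        return []                       # odd discriminant: 2 does not divide it
    delta = disc // 4
    if delta % 2 == 1:                  # Cases 1 and 2
        return ["chi4"] if delta % 4 == 3 else []
    by_residue = {2: ["chi8"], 6: ["chi8'"], 4: ["chi4"], 0: ["chi4", "chi8"]}
    return by_residue[delta % 8]        # Cases 3, 4, 5, 6

def odd_residues_mod8(a, b, c, bound=15):
    """Residues mod 8 of the odd values of ax^2 + bxy + cy^2 at primitive pairs."""
    residues = set()
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) == 1:
                n = a * x * x + b * x * y + c * y * y
                if n % 2 == 1:
                    residues.add(n % 8)
    return residues

if __name__ == "__main__":
    for disc in (-84, -12, 40, -8, 24, -48, -32):
        print(disc, characters_for_two(disc))
    print(odd_residues_mod8(1, 0, 3))    # x^2 + 3y^2, discriminant -12
    print(odd_residues_mod8(1, 0, -10))  # x^2 - 10y^2, discriminant 40
```

For x² + 3y² all four odd classes mod 8 occur, in line with Case 2, while for x² − 10y²
only the classes ±1 mod 8 occur, in line with Case 3.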


We have now defined a set of characters for each discriminant Δ, with one character
for each odd prime dividing Δ and either zero, one, or two characters for the
prime 2 when Δ is even. The character table for discriminant Δ has one row for each
of these characters.

If one restricts attention to fundamental discriminants then the only relevant
columns in the table in the preceding proposition are the second, third, and fourth
columns. Thus the characters for the prime 2 that arise in the three cases
of fundamental discriminants are exactly χ₄, χ₈, and χ₈′.


A nice property satisfied by characters is that they are multiplicative, so χ(mn) =
χ(m)χ(n) for all m and n for which χ is defined. For the characters χ_p associated to
odd primes p this is just the basic property (mn/p) = (m/p)(n/p) of Legendre symbols.
For the prime 2 the characters χ₄ and χ₈ are multiplicative as well. For χ₄ this
holds since χ₄(1·1) = +1 = χ₄(1)χ₄(1), χ₄(1·3) = −1 = χ₄(1)χ₄(3), and χ₄(3·3) =
+1 = χ₄(3)χ₄(3). Similarly for χ₈ we have χ₈(±1 · ±1) = +1 = χ₈(±1)χ₈(±1),
χ₈(±1 · ±3) = −1 = χ₈(±1)χ₈(±3), and χ₈(±3 · ±3) = +1 = χ₈(±3)χ₈(±3). The
multiplicativity of χ₈′ follows since χ₈′ = χ₄χ₈.

In fact χ₄, χ₈, and χ₈′ are the only multiplicative functions from the odd integers
mod 8 to {±1}, apart from the trivial function assigning +1 to all four of 1, 3, 5, 7.
To see this, note first that each of 3, 5, 7 has square equal to 1 mod 8 and the product
of any two of 3, 5, 7 is the third, mod 8. This means that a multiplicative function χ
from odd integers mod 8 to {±1} is completely determined by the two values χ(3)
and χ(5) since χ(1) = χ(3)χ(3) and χ(7) = χ(3)χ(5). For χ₄ the values on 3 and 5
are −1, +1, for χ₈ they are −1, −1, and for χ₈′ = χ₄χ₈ they are +1, −1. The only
other possibility is +1, +1 but this leads to the trivial character.
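
This count of four multiplicative functions can be confirmed by exhausting all sixteen
assignments of signs to the classes 1, 3, 5, 7. The following Python sketch does exactly
that; the representation of a function as a dictionary and the name
`multiplicative_functions` are our own choices.

```python
from itertools import product

ODD = (1, 3, 5, 7)  # the odd congruence classes mod 8

def multiplicative_functions():
    """All maps f from the odd classes mod 8 to {+1, -1} with f(mn) = f(m)f(n)."""
    result = []
    for values in product((1, -1), repeat=4):
        f = dict(zip(ODD, values))
        if all(f[m * n % 8] == f[m] * f[n] for m in ODD for n in ODD):
            result.append(f)
    return result

if __name__ == "__main__":
    for f in multiplicative_functions():
        print(f)
```

Exactly four dictionaries are produced: the trivial function together with the value
tables of χ₄, χ₈, and their product.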


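Another useful observation is that the value of the Legendre symbol (Δ/p) is
determined by the characters for discriminant Δ. From Proposition 6.9 the formulas for
(Δ/p) are given by the second column of the following table, where Δ = ε2^s p₁ ··· p_k
for ε = ±1 and odd primes p_i. The product of characters corresponding to each
product of Legendre symbols in the second column is given in the third column.


Δ = ε2^s p₁ ··· p_k              |  (Δ/p)                          |  product of characters
---------------------------------+---------------------------------+------------------------
s even, εp₁ ··· p_k ≡ 1 mod 4    |  (p/p₁) ··· (p/p_k)             |  χ_{p₁} ··· χ_{p_k}
s even, εp₁ ··· p_k ≡ 3 mod 4    |  (−1/p)(p/p₁) ··· (p/p_k)       |  χ₄ χ_{p₁} ··· χ_{p_k}
s odd,  εp₁ ··· p_k ≡ 1 mod 4    |  (2/p)(p/p₁) ··· (p/p_k)        |  χ₈ χ_{p₁} ··· χ_{p_k}
s odd,  εp₁ ··· p_k ≡ 3 mod 4    |  (−1/p)(2/p)(p/p₁) ··· (p/p_k)  |  χ₈′ χ_{p₁} ··· χ_{p_k}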


For the prime 2 we can compare this table with the one in Proposition 6.21. The
first four of the six cases in the earlier table are included in the four cases here, so
characters determine (Δ/p) in these cases. When Δ = 16(2m + 1) in the previous table
we are in one of the first two cases here, so we have χ₄ available to determine the
value of (Δ/p). When Δ = 32m in the earlier table both χ₄ and χ₈ are available so
(Δ/p) is again determined by the characters for discriminant Δ since χ₈′ = χ₄χ₈.


Let us denote the product of characters in the last column of the table above
by χ_Δ, so (Δ/p) = χ_Δ(p). The value χ_Δ(n) = ±1 is defined whenever n is coprime
to Δ. If χ_Δ(n) = +1 it need not be true that n is represented in discriminant Δ when
n is not a prime. For example for Δ = −4 we have χ_Δ(21) = χ₄(21) = χ₄(3)χ₄(7) =
(−1)(−1) = +1 but 21 is not represented by the form x² + y², the only form in
this discriminant up to equivalence. However it is always true that χ_Δ(n) = +1 if
n is represented in discriminant Δ since in this case each prime factor p of n is
represented, hence χ_Δ(p) = +1, and χ_Δ(n) is the product of these terms χ_Δ(p)
since χ_Δ is multiplicative, being a product of multiplicative functions.
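
The example with n = 21 in discriminant −4 can be checked in a few lines. In this Python
sketch the helper names are our own, and "represented" is taken, as throughout the
chapter, to mean represented at a primitive pair (x, y).

```python
from math import gcd, isqrt

def chi4(n):
    """+1 for odd n = 1 mod 4 and -1 for odd n = 3 mod 4."""
    return 1 if n % 4 == 1 else -1

def is_primitive_sum_of_two_squares(n):
    """True if n = x^2 + y^2 for some pair (x, y) with gcd(x, y) = 1."""
    for x in range(isqrt(n) + 1):
        y = isqrt(n - x * x)
        if x * x + y * y == n and gcd(x, y) == 1:
            return True
    return False

if __name__ == "__main__":
    print(chi4(3), chi4(7), chi4(21))
    print(is_primitive_sum_of_two_squares(21))
    print(is_primitive_sum_of_two_squares(65))
```

Here the character takes the value +1 on 21 even though 21 is not a sum of two squares,
whereas 65 = 1² + 8² has value +1 and is represented.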


Next let us verify that some of the special features of the character tables in the 
earlier examples hold in general. 


Proposition 6.22. (a) The columns of a character table contain all possible
combinations of +1 and −1, and each such combination occurs in the same number of
columns.

(b) If the discriminant Δ is not a square then half of the columns have χ_Δ(n) = +1
and half have χ_Δ(n) = −1 for numbers n in the congruence class corresponding
to the column.


190 Chapter 6 — Representations by Quadratic Forms 


For example, if $\Delta$ is a fundamental discriminant then $X_\Delta$ is just the product of all
the characters in the character table, so the combinations of $\pm 1$'s that give $X_\Delta = +1$
in these cases are the combinations with an even number of $-1$'s. This need not be
true for nonfundamental discriminants as the earlier example $\Delta = -48$ shows.

From statement (b) in the proposition we immediately deduce:


Corollary 6.23. For hyperbolic and elliptic forms, the primes not dividing the dis-
criminant $\Delta$ that are represented in discriminant $\Delta$ are the primes in exactly half
of the congruence classes mod $\Delta$ of numbers coprime to $\Delta$.


For the proof of Proposition 6.22 we will need the following fact: 


Lemma 6.24. For a power $p^r$ of an odd prime $p$ exactly half of the $p^r - p^{r-1}$
congruence classes mod $p^r$ of numbers $a$ not divisible by $p$ satisfy $\left(\frac{a}{p}\right) = +1$.


Proof: First we do the case $r = 1$. The $p - 1$ nonzero congruence classes mod $p$ are
$\pm 1, \pm 2, \cdots, \pm\frac{1}{2}(p-1)$. The two numbers $+a$ and $-a$ in each pair $\pm a$ have the same
square, so there are at most $\frac{1}{2}(p-1)$ different nonzero squares mod $p$. In fact there
are exactly this many since if $a^2 \equiv b^2$ mod $p$ then $p$ divides $a^2 - b^2 = (a-b)(a+b)$,
so since $p$ is prime it must divide either $a - b$ or $a + b$ which means that either $a \equiv b$
or $a \equiv -b$ mod $p$. Thus exactly half of the $p - 1$ nonzero congruence classes mod $p$
are squares, so the lemma is proved when $r = 1$.

Now suppose $r > 1$. The value of $\left(\frac{a}{p}\right)$ depends only on the congruence class of
$a$ mod $p$ so there are the same number of numbers $a$ with $\left(\frac{a}{p}\right) = +1$ in each of the
intervals $[0, p]$, $[p, 2p]$, $[2p, 3p]$, etc. There are $p^{r-1}$ of these intervals in $[0, p^r]$.
Thus half of the $p^{r-1}(p-1) = p^r - p^{r-1}$ congruence classes mod $p^r$ of numbers $a$
not divisible by $p$ have $\left(\frac{a}{p}\right) = +1$ and half have $\left(\frac{a}{p}\right) = -1$. □
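As a sanity check on the lemma, one can count the classes by brute force for small prime powers. This is a minimal sketch with hypothetical function names; the Legendre symbol is computed by listing squares rather than by any formula from the text.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def count_classes(p: int, r: int) -> tuple[int, int]:
    """Return (number of classes a mod p^r with p not dividing a,
               number of those classes with (a/p) = +1)."""
    units = [a for a in range(p ** r) if a % p != 0]
    plus = sum(1 for a in units if legendre(a, p) == 1)
    return len(units), plus

for p, r in [(3, 2), (5, 3), (7, 1)]:
    total, plus = count_classes(p, r)
    print(p, r, total, plus)  # plus should be exactly half of total
```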


Proof of Proposition 6.22: Let us write $\Delta = \varepsilon 2^s p_1^{r_1} \cdots p_k^{r_k}$ where
$\varepsilon = \pm 1$, $s \ge 0$, and the $p_i$'s are the distinct odd prime divisors of $\Delta$. Thus the
characters for this discriminant are $\chi_{p_1}, \cdots, \chi_{p_k}$ and either zero, one, or two more
characters associated to the prime 2 when $s > 0$.

To prove statement (a) choose numbers $a_i$ realizing any combination of preas-
signed values $\chi_{p_i}(a_i) = \pm 1$. When $s > 0$ we also choose a number 1, 3, 5, or 7 to
realize any preassigned pair of values for $\chi_4$ and $\chi_8$, hence for any preassigned val-
ues for the characters associated to the prime 2. By the Chinese Remainder Theorem
there is a number $a$ congruent to each $a_i$ mod $p_i^{r_i}$ and to the chosen number 1, 3, 5, 7
mod 8. The number $a$ is coprime to $\Delta$ since it is nonzero mod $p_i$ for each $i$ and is
odd when $s > 0$. Thus the column in the character table corresponding to $a$ realizes
the chosen values for all the characters.

To prove the second half of statement (a) we will count the number of columns
in the character table realizing a given combination of values $\pm 1$ and see that this
number does not depend on which combination is chosen. By the preceding lemma
the number of choices for $a_i$ mod $p_i^{r_i}$ in the previous paragraph is
$\frac{1}{2} p_i^{r_i - 1}(p_i - 1)$, so the Chinese Remainder Theorem implies that when $s = 0$ the number
of congruence classes mod $\Delta$ realizing a given combination of values $\pm 1$ is the product of
these numbers $\frac{1}{2} p_i^{r_i - 1}(p_i - 1)$. When $s > 0$ but there is no character for the prime 2,
the product of the numbers $\frac{1}{2} p_i^{r_i - 1}(p_i - 1)$ is multiplied by $2^{s-1}$ since this is the
number of odd congruence classes mod $2^s$. If there is one character for the prime 2
the number $2^{s-1}$ is cut in half, and if there are two characters for the prime 2 it is cut
in half again. Thus in all cases the number of columns realizing a given combination
of $\pm 1$'s is independent of the combination.

For (b), consider the definition of $X_\Delta$ which has four different cases depending
on the prime factorization of $\Delta$. If $\Delta$ is a square then the applicable formula is the
first of the four formulas since an odd square is 1 mod 4, and in fact the formula
degenerates to just the constant $+1$ since its terms all cancel out, as each prime factor
of $\Delta$ occurs to an even power. When $\Delta$ is not a square, the terms in the first of the
four formulas do not all cancel out, and in the other three formulas there is also at
least one term remaining after cancellations, either $\chi_4$, $\chi_8$, or $\chi_8'$.

In view of property (a), to prove (b) it will suffice to show that when $\Delta$ is not
a square, the set of combinations of values $\pm 1$ in columns of the character table
that give $X_\Delta = +1$ has the same number of elements as the set of combinations
that give $X_\Delta = -1$. But this is true since we can interchange these two
sets by choosing one term in the formula for $X_\Delta$ that remains after cancellation and
switching the sign of the value $\pm 1$ for this term, keeping the values for the other
characters unchanged. □
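Both statements of Proposition 6.22 can be observed in a small case. The sketch below uses $\Delta = -84 = -2^2 \cdot 3 \cdot 7$ and assumes, as an illustration rather than something taken from the tables in the text, that the characters for this fundamental discriminant are $\chi_3$, $\chi_7$, and $\chi_4$, so that $X_\Delta$ is their product. All names are hypothetical.

```python
from collections import Counter
from math import gcd

def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def chi4(n: int) -> int:
    """chi_4(n) for odd n: +1 if n = 1 mod 4, -1 if n = 3 mod 4."""
    return 1 if n % 4 == 1 else -1

DELTA = -84  # assumed characters: chi_3, chi_7, chi_4

def column(n: int) -> tuple[int, int, int]:
    """The column of character values for the class of n mod |DELTA|."""
    return (legendre(n, 3), legendre(n, 7), chi4(n))

# One representative for each congruence class coprime to DELTA.
classes = [n for n in range(1, abs(DELTA)) if gcd(n, DELTA) == 1]
counts = Counter(column(n) for n in classes)

# Number of columns on which the product of the three characters is +1.
x_delta_plus = sum(1 for n in classes if column(n)[0] * column(n)[1] * column(n)[2] == 1)

print(len(classes), dict(counts), x_delta_plus)
```

Each of the eight sign combinations should occur equally often, and the product of the characters should be $+1$ on exactly half of the columns.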


Recall the concept of genus that was introduced informally in Section 6.1. The
idea was that if two forms of the same discriminant cannot be distinguished by looking
only at their values modulo the discriminant then they should be regarded as having
the same genus. Here it is best to restrict attention just to primitive forms. We can
now give this notion a more precise definition by saying that two primitive forms of
discriminant $\Delta$ have the same genus if each character for discriminant $\Delta$ takes the
same value on the two forms, where the value of a character on a form means its value
on all numbers in the topograph not divisible by the prime associated to the character.

In fact there is always a single number in the topograph that can be used to
evaluate all the characters, according to the following general result:


Proposition 6.25. Given a positive integer $n$ and a primitive form $Q$ that represents
at least one positive number, then $Q$ represents a positive number coprime to $n$.


For the application to evaluating characters we choose $n = |\Delta|$ for $\Delta$ the discrim-
inant of $Q$ (which we assume is nonzero to avoid trivialities).


Proof: Let $Q = ax^2 + bxy + cy^2$. We can replace $Q$ by any equivalent form so we
can arrange that $a > 0$ and $c > 0$ by choosing two adjacent regions in the topograph
of $Q$ with positive labels $a$ and $c$. We can also assume $b \ge 0$ since changing the sign
of $b$ produces an equivalent form.

The case $n = 1$ is trivial since every positive number is coprime to 1, so we may
assume $n > 1$. Suppose first that $n$ is a prime $p$. One of the following three cases
will apply:


(1) If $p$ does not divide $a$ let $(x, y)$ be a primitive pair with $p$ dividing $y$ but
not $x$. Then $p$ will not divide $ax^2 + bxy + cy^2$. For example we could take
$(x, y) = (1, p)$.

(2) If $p$ divides $a$ but not $c$ let $(x, y)$ be a primitive pair with $p$ dividing $x$ but
not $y$. Then $p$ will not divide $ax^2 + bxy + cy^2$. For example we could take
$(x, y) = (p, 1)$.

(3) If $p$ divides both $a$ and $c$ then it will not divide $b$ since $Q$ is primitive. In this
case let $(x, y)$ be a primitive pair with neither $x$ nor $y$ divisible by $p$. Then $p$
will not divide $ax^2 + bxy + cy^2$. For example we could take $(x, y) = (1, 1)$.


This finishes the proof when $n$ is prime. For a general $n$ let $p_1, \cdots, p_k$ be its distinct
prime divisors. For each $p_i$ let $(x_i, y_i)$ be $(1, p_i)$, $(p_i, 1)$, or $(1, 1)$ according to which
of the three cases above applies to $p_i$. Now let $x = x_1 \cdots x_k$ and $y = y_1 \cdots y_k$.
Then $x$ and $y$ are coprime since no $p_i$ is a factor of both $x$ and $y$. If the number
$ax^2 + bxy + cy^2$ is not coprime to $n$ it will be divisible by some $p_i$. If case (1) applies
to $p_i$ then $p_i$ divides $y$ but not $x$ so $p_i$ does not divide $ax^2 + bxy + cy^2$. Likewise
if cases (2) or (3) apply to $p_i$ then $p_i$ does not divide $ax^2 + bxy + cy^2$. Thus no $p_i$
can divide $ax^2 + bxy + cy^2$. Finally, $ax^2 + bxy + cy^2$ is positive since $x$ and $y$ are
positive as are the coefficients except possibly $b$ which is either positive or zero. □
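The construction in this proof is explicit enough to run. The following Python sketch follows the three cases literally; it assumes the form has already been normalized so that $a > 0$, $c > 0$, $b \ge 0$ and $\gcd(a, b, c) = 1$, and the function names are hypothetical.

```python
from math import gcd

def prime_divisors(n: int) -> list[int]:
    """Distinct prime divisors of n >= 1, by trial division."""
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes

def coprime_value(a: int, b: int, c: int, n: int) -> tuple[int, int, int]:
    """Return (x, y, Q(x, y)) with Q(x, y) = a x^2 + b x y + c y^2 coprime to n.

    Assumes gcd(a, b, c) = 1, a > 0, c > 0, b >= 0.
    """
    x = y = 1
    for p in prime_divisors(n):
        if a % p != 0:      # case (1): use (x_i, y_i) = (1, p)
            y *= p
        elif c % p != 0:    # case (2): use (x_i, y_i) = (p, 1)
            x *= p
        # case (3): use (x_i, y_i) = (1, 1), nothing to multiply
    return x, y, a * x * x + b * x * y + c * y * y

x, y, value = coprime_value(2, 1, 3, 30)
print(x, y, value, gcd(value, 30), gcd(x, y))
```

For the form $2x^2 + xy + 3y^2$ and $n = 30$ the three primes 2, 3, 5 fall under cases (2), (1), (1), giving $(x, y) = (2, 15)$.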


The number of genera in discriminant $\Delta$ is at most $2^\kappa$ where $\kappa$ is the number of
characters in discriminant $\Delta$. In all the character tables we have looked at, only half
of the $2^\kappa$ possible combinations of $\pm 1$'s were actually realized by forms, and in fact
this is true generally:


Theorem 6.26. If $\Delta$ is not a square then the number of genera of primitive forms
of discriminant $\Delta$ is $2^{\kappa - 1}$ where $\kappa$ is the number of characters in discriminant $\Delta$.


This turns out to be fairly hard to prove. The original proof by Gauss required
a somewhat lengthy digression into the theory of quadratic forms in three variables.
An exposition of this proof can be found in the book by Flath listed in the Bibliogra-
phy. We will give a different proof that deduces the result rather quickly from things
we have already done, together with Dirichlet's Theorem about primes in arithmetic
progressions discussed at the end of Section 6.1, which we will not prove. We will
not need the full strength of Dirichlet's Theorem, and in fact all we will actually need
is that each congruence class of numbers $x \equiv b$ mod $a$ contains at least one prime
greater than 2 if $a$ and $b$ are coprime. One might think this would be easier to prove
than that there are infinitely many primes in the congruence class, but this seems not
to be the case.


Proof of Theorem 6.26 using Dirichlet's Theorem: We have seen that for each prim-
itive form $Q$ of discriminant $\Delta$ there is a number $n$ coprime to $\Delta$ that is represented
by $Q$. Then $X_\Delta(n)$ is defined, and we saw when we defined $X_\Delta$ that $X_\Delta(n) = +1$
when $n$ is represented by a form of discriminant $\Delta$. In the proof of Proposition 6.22
we showed that exactly half of the $2^\kappa$ possible combinations of $\pm 1$'s have $X_\Delta = +1$,
so the number of genera of forms is at most $2^{\kappa - 1}$.

To show that the number of genera is at least $2^{\kappa - 1}$ consider a combination of $\pm 1$'s
with $X_\Delta = +1$. By Proposition 6.22 this combination occurs in some column of the
character table. This column corresponds to some number $n$ coprime to $\Delta$. By Dirich-
let's Theorem there exists a prime $p$ congruent to $n$ mod $\Delta$. We have $X_\Delta(p) = +1$, so
since $p$ is prime this implies that $p$ is represented by some form of discriminant $\Delta$.
This form must be primitive, otherwise every number it represents would be divisible
by some number $d > 1$ dividing $\Delta$ so it could not represent $p$ which is coprime to $\Delta$.
Thus every combination of $\pm 1$'s with $X_\Delta = +1$ is realized by some primitive form, so
the number of genera is at least $2^{\kappa - 1}$. □


From this theorem we can deduce two very strong corollaries. 


Corollary 6.27. The number of genera in discriminant $\Delta$ is equal to the number
of equivalence classes of primitive forms of discriminant $\Delta$ that have mirror sym-
metry.


This may seem a little surprising since there is no apparent connection between
genera and mirror symmetry. A possible explanation might be that each genus con-
tains exactly one equivalence class of primitive forms with mirror symmetry, but this
is not always true. For example when $\Delta = -56$ we saw in Section 6.1 that there are
two genera and two equivalence classes of mirror symmetric forms, but both these
forms belong to the same genus. The true explanation will come in Chapter 7 when
we study the class group.


Proof: The number of equivalence classes of primitive forms with mirror symmetry
was computed in Theorem 5.9 to be $2^{k-1}$ in most cases, where $k$ is the number of
distinct prime divisors of $\Delta$. The exceptions are discriminants $\Delta = 4(4m + 1)$ when
$2^{k-1}$ is replaced by $2^{k-2}$, and $\Delta = 32m$ when $2^{k-1}$ is replaced by $2^k$. In the nonexcep-
tional cases we have $k = \kappa$, the number of characters in discriminant $\Delta$ since there is
one character for each prime dividing $\Delta$. When $\Delta = 4(4m + 1)$ there is no character
for the prime 2 so $\kappa = k - 1$, and when $\Delta = 32m$ there are two characters for the
prime 2 so $\kappa = k + 1$. The result follows. □


Corollary 6.28. For a fixed discriminant $\Delta$, each genus of primitive forms consists
of a single equivalence class of forms if and only if all the topographs of primitive
forms of discriminant $\Delta$ have mirror symmetry.


Proof: Let $E(\Delta)$ be the set of equivalence classes of primitive forms of discriminant
$\Delta$ and let $G(\Delta)$ be the set of genera of primitive forms of discriminant $\Delta$. There
is a natural function $\Phi : E(\Delta) \to G(\Delta)$ assigning to each equivalence class of forms
the genus of these forms. The function $\Phi$ is onto since there is at least one form in
each genus, by the definition of genus. If all primitive forms of discriminant $\Delta$ have
mirror symmetry then Corollary 6.27 says that the sets $E(\Delta)$ and $G(\Delta)$ have the same
number of elements. Then since $\Phi$ is onto it must also be one-to-one. This means
that each genus consists of a single equivalence class of forms.

Conversely, if each genus consists of a single equivalence class then $\Phi$ is one-
to-one. Since $\Phi$ is also onto, this means it is a one-to-one correspondence so $E(\Delta)$
and $G(\Delta)$ have the same number of elements. By Corollary 6.27 this means that
the equivalence classes of primitive forms with mirror symmetry account for all the
elements of $E(\Delta)$, and the proof is complete. □


Exercises 


1. For the following discriminants determine the class number and a form in each
class, then use a character table to determine which primes are represented by each
of the forms, at least to the extent that this can be determined by characters. Also
determine the various genera.

(a) $-24$ (b) $24$ (c) $-39$ (d) $-96$

2. Determine which primes are represented by each of the following forms:

(a) $x^2 + 8y^2$ (b) $x^2 + 9y^2$ (c) $x^2 + 25y^2$ (d) $x^2 - 12y^2$ and $12x^2 - y^2$


3. Show that each genus consists of a single equivalence class of forms for the fol-
lowing discriminants:
(a) $-168$ (b) $-660$ (c) $105$


4. Find the smallest positive discriminant for which the number of genera is 16. How
does the answer change if only fundamental discriminants are allowed?


5. Show that for a positive nonsquare discriminant $\Delta$, if the principal form represents
$-1$ then all odd primes $p$ dividing $\Delta$ must satisfy $p \equiv 1$ mod 4. Hint: Use $\chi_p$.


6. Use Propositions 6.1 and 6.25 to show that in each nonzero discriminant there
exists a form that represents an infinite number of primes.


6.4 Proof of Quadratic Reciprocity


First let us show that quadratic reciprocity can be expressed more concisely as a
single formula:

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}$$

Here $p$ and $q$ are distinct odd primes. Since they are odd, the fractions $\frac{p-1}{2}$ and
$\frac{q-1}{2}$ are integers. The only way the exponent $\frac{p-1}{2} \cdot \frac{q-1}{2}$ can be odd is for both
factors to be odd, so $\frac{p-1}{2} = 2k + 1$ and $\frac{q-1}{2} = 2l + 1$, which is equivalent to saying
$p = 4k + 3$ and $q = 4l + 3$. Thus the only time that the right side of the formula shown
above is $-1$ is when $p$ and $q$ are both congruent to 3 mod 4, and quadratic reciprocity is the
assertion that the left side of the formula has exactly this property.
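Before going through the proof it can be reassuring to test the single-formula version on small primes. This is a brute-force sketch with hypothetical names; the Legendre symbol is computed by listing squares, so it does not rely on anything proved later in the section.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def reciprocity_sign(p: int, q: int) -> int:
    """Right side of the formula: (-1)^(((p-1)/2) * ((q-1)/2))."""
    return (-1) ** (((p - 1) // 2) * ((q - 1) // 2))

ODD_PRIMES = [3, 5, 7, 11, 13, 17, 19, 23]

formula_holds = all(
    legendre(p, q) * legendre(q, p) == reciprocity_sign(p, q)
    for p in ODD_PRIMES
    for q in ODD_PRIMES
    if p != q
)
print(formula_holds)
```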

There will be three main steps in the proof of quadratic reciprocity. The first is
to derive an explicit algebraic formula for $\left(\frac{a}{p}\right)$ due originally to Euler. The second
step is to use this formula to give a somewhat more geometric interpretation of $\left(\frac{a}{p}\right)$
in terms of the number of dots in a certain triangular pattern. Then the third step is
the actual proof of quadratic reciprocity using symmetry properties of the patterns
of dots. This proof is due to Eisenstein, first published in 1844, simplifying an earlier
proof by Gauss who was the first to give a full proof of quadratic reciprocity.


Step 1. In what follows we will always use $p$ to denote an odd prime, and the symbol
$a$ will always denote an arbitrary nonzero integer not divisible by $p$. When we write
a congruence such as $a \equiv b$ this will always mean congruence mod $p$, even if we do
not explicitly say that the modulus is $p$.

Euler's formula is

$$\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \mod p$$

For example, for $p = 11$ Euler's formula says $\left(\frac{2}{11}\right) \equiv 2^5 = 32 \equiv -1$ mod 11 and
$\left(\frac{3}{11}\right) \equiv 3^5 = 243 \equiv +1$ mod 11. These are the correct values since the squares mod
11 are $(\pm 1)^2 = 1$, $(\pm 2)^2 = 4$, $(\pm 3)^2 = 9$, $(\pm 4)^2 \equiv 5$, and $(\pm 5)^2 \equiv 3$.

Euler's formula determines the value of $\left(\frac{a}{p}\right)$ uniquely since $+1$ and $-1$ are not
congruent mod $p$ since $p > 2$. It is not immediately obvious that the number $a^{\frac{p-1}{2}}$
should always be congruent to either $+1$ or $-1$ mod $p$, but when we prove Euler's
formula we will see that this has to be true.
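Euler's formula also gives a practical way to compute Legendre symbols. A minimal Python sketch, with hypothetical function names, using the built-in three-argument `pow` for modular exponentiation and comparing against a direct list of squares:

```python
def legendre_euler(a: int, p: int) -> int:
    """Legendre symbol (a/p) via Euler's formula, for an odd prime p not dividing a."""
    t = pow(a % p, (p - 1) // 2, p)  # t is either 1 or p - 1
    return 1 if t == 1 else -1

def legendre_squares(a: int, p: int) -> int:
    """Legendre symbol (a/p) by listing the nonzero squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

print(legendre_euler(2, 11), legendre_euler(3, 11))
print(all(legendre_euler(a, 11) == legendre_squares(a, 11) for a in range(1, 11)))
```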


As a special case, taking $a = -1$ in Euler's formula gives the calculation of $\left(\frac{-1}{p}\right)$:

$$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} = \begin{cases} +1 & \text{if } p = 4k + 1 \\ -1 & \text{if } p = 4k + 3 \end{cases}$$

Before proving Euler's formula we will need to derive a few preliminary facts
about congruences modulo a prime $p$. First let us note that each of the numbers
$a = 1, 2, \cdots, p - 1$ has a multiplicative inverse mod $p$. This is a special case of the
fact that each number coprime to a number $n$ has a multiplicative inverse mod $n$ as
we saw in Section 2.3. (This was because the equation $ax + ny = 1$ has an integer
solution $(x, y)$ whenever $a$ and $n$ are coprime.) Any two choices for an inverse to
$a$ mod $p$ are congruent mod $p$ since if $ax \equiv 1$ and $ax' \equiv 1$ then multiplying both
sides of $ax' \equiv 1$ by $x$ gives $xax' \equiv x$, and $xa \equiv 1$ so we conclude that $x \equiv x'$.

Which numbers equal their own inverse mod $p$? If $a \cdot a \equiv 1$, then we can rewrite
this as $a^2 - 1 \equiv 0$, or equivalently $(a + 1)(a - 1) \equiv 0$. This is certainly a valid con-
gruence if $a \equiv \pm 1$, so suppose that $a \not\equiv \pm 1$. The factor $a + 1$ is then not congruent
to 0 mod $p$ so it has a multiplicative inverse mod $p$, and if we multiply the congru-
ence $(a + 1)(a - 1) \equiv 0$ by this inverse, we get $a - 1 \equiv 0$ so $a \equiv 1$, contradicting
the assumption that $a \not\equiv \pm 1$. This argument shows that the only numbers among
$1, 2, \cdots, p - 1$ that are congruent to their inverses mod $p$ are 1 and $p - 1$.

An application of this fact is a result known as Wilson's Theorem:


$(p - 1)! \equiv -1$ mod $p$ whenever $p$ is prime.


To see why this is true, observe that in the product $(p - 1)! = (1)(2) \cdots (p - 1)$ each
factor other than 1 and $p - 1$ can be paired with its multiplicative inverse mod $p$ and
these two terms multiply together to give 1 mod $p$, so the whole product is congruent
to just $(1)(p - 1)$ mod $p$. Since $p - 1 \equiv -1$ mod $p$ this gives Wilson's Theorem.
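Both ingredients of this argument are easy to check numerically: the only self-inverse classes are $1$ and $p - 1$, and $(p-1)!$ is congruent to $-1$. A small Python sketch with hypothetical names:

```python
from math import factorial

def self_inverse_classes(p: int) -> list[int]:
    """Numbers a in 1..p-1 with a * a = 1 mod p."""
    return [a for a in range(1, p) if (a * a) % p == 1]

def wilson_holds(p: int) -> bool:
    """Check (p - 1)! = -1 mod p, written as p - 1 to stay in the range 0..p-1."""
    return factorial(p - 1) % p == p - 1

print(self_inverse_classes(11))
print([wilson_holds(p) for p in (3, 5, 7, 11, 13)])
```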
Now let us prove the following congruence known as Fermat's Little Theorem:


$a^{p-1} \equiv 1$ mod $p$ whenever $p$ is an odd prime not dividing $a$.


To see this, note first that the numbers $a, 2a, 3a, \cdots, (p - 1)a$ are all distinct mod $p$
since we know that $a$ has a multiplicative inverse mod $p$, so in a congruence $ma \equiv na$
we can multiply both sides by the inverse of $a$ to deduce that $m \equiv n$. Let us call this
property that $ma \equiv na$ implies $m \equiv n$ the cancellation property for congruences
mod $p$.

It follows from the cancellation property that the set $\{a, 2a, 3a, \cdots, (p - 1)a\}$
is the same mod $p$ as $\{1, 2, 3, \cdots, p - 1\}$ since both sets have $p - 1$ elements and
neither set contains numbers that are 0 mod $p$. (If $ma \equiv 0$ then multiplying by the
inverse of $a$ gives $m \equiv 0$.) If we take the product of all the numbers in each of these
two sets we obtain the following congruence:


$$(a)(2a)(3a) \cdots (p - 1)a \equiv (1)(2)(3) \cdots (p - 1) \mod p$$


We can cancel the factors $2, 3, \cdots, p - 1$ from both sides by repeated applications of
the cancellation property. The result is the congruence $a^{p-1} \equiv 1$ claimed by Fermat's
Little Theorem.
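The rearrangement step is the heart of this proof, and it can be seen concretely. In the Python sketch below (hypothetical names), the multiples $a, 2a, \cdots, (p-1)a$ are reduced mod $p$ and sorted, which should give back $1, 2, \cdots, p-1$.

```python
def multiples_mod_p(a: int, p: int) -> list[int]:
    """The numbers a, 2a, ..., (p-1)a reduced mod p, in increasing order."""
    return sorted((m * a) % p for m in range(1, p))

print(multiples_mod_p(3, 7))   # a rearrangement of 1..6
print(pow(3, 6, 7))            # Fermat's Little Theorem for a = 3, p = 7
```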


Now we can prove Euler's formula for $\left(\frac{a}{p}\right)$. The first case is that $\left(\frac{a}{p}\right) = +1$. Then
$a \equiv x^2$ for some $x \not\equiv 0$ and $a^{\frac{p-1}{2}} \equiv x^{p-1}$ so by Fermat's Little Theorem we have
$a^{\frac{p-1}{2}} \equiv 1$. Thus Euler's formula $\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}}$ is valid in this case since both sides
are $+1$.

The other case is that $\left(\frac{a}{p}\right) = -1$ so $a$ is not a square mod $p$. Observe first that
the congruence $xy \equiv a$ has a solution $y$ mod $p$ for each $x \not\equiv 0$ since $x$ has an
inverse $x^{-1}$ mod $p$ so we can take $y = x^{-1}a$. Moreover the solution $y$ is unique
mod $p$ since $xy_1 \equiv xy_2$ implies $y_1 \equiv y_2$ by the cancellation property. Since we
are in the case that $a$ is not a square mod $p$ the solution $y$ of $xy \equiv a$ satisfies
$y \not\equiv x$. Thus the numbers $1, 2, 3, \cdots, p - 1$ are divided up into $\frac{1}{2}(p - 1)$ pairs
$\{x_1, y_1\}, \{x_2, y_2\}, \cdots, \{x_{\frac{p-1}{2}}, y_{\frac{p-1}{2}}\}$ with $x_i y_i \equiv a$ for each $i$. Multiplying all these
$\frac{1}{2}(p - 1)$ pairs together, we get:

$$a^{\frac{p-1}{2}} \equiv x_1 y_1 x_2 y_2 \cdots x_{\frac{p-1}{2}} y_{\frac{p-1}{2}}$$

The product on the right is just a rearrangement of $(1)(2)(3) \cdots (p - 1)$, and Wilson's
Theorem says that this product is congruent to $-1$ mod $p$. Thus we see that Euler's
formula $\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}}$ holds also when $\left(\frac{a}{p}\right) = -1$, completing the proof in both cases.
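The pairing used in the second case can be made concrete. The sketch below, with hypothetical names, splits $1, \cdots, p-1$ into pairs with product $a$ for a non-square $a$; it relies on Python's `pow(x, -1, p)` for modular inverses, which is available from Python 3.8.

```python
from math import factorial

def pairs_with_product(a: int, p: int) -> list[tuple[int, int]]:
    """Split 1..p-1 into pairs (x, y) with x * y = a mod p.

    Assumes p is an odd prime and a is not a square mod p, so that y != x.
    """
    seen: set[int] = set()
    pairs = []
    for x in range(1, p):
        if x in seen:
            continue
        y = (pow(x, -1, p) * a) % p  # the unique solution of x * y = a mod p
        pairs.append((x, y))
        seen.update((x, y))
    return pairs

# 2 is not a square mod 11.
print(pairs_with_product(2, 11))
print(pow(2, 5, 11), factorial(10) % 11)  # both are 10, that is, -1 mod 11
```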


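A consequence of Euler's formula is the multiplicative property of Legendre sym-
bols that we stated and used earlier in the chapter:

$$\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$$

This holds since $(ab)^{\frac{p-1}{2}} = a^{\frac{p-1}{2}} b^{\frac{p-1}{2}}$.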


Step 2. Our goal here will be to express the Legendre symbol $\left(\frac{a}{p}\right)$ in more geometric
terms. To begin, consider a rectangle in the first quadrant of the $xy$-plane that is $p$
units wide and $a$ units high, with one corner at the origin and the opposite corner at
the point $(p, a)$. The picture at the right shows the case $(p, a) = (7, 5)$. We will be
interested in points that lie strictly in the interior of the rectangle and whose coordinates
are integers. Points satisfying the latter condition are called lattice points. The number of
lattice points in the interior is then $(p - 1)(a - 1)$ since their $x$-coordinates can range from
1 to $p - 1$ and their $y$-coordinates from 1 to $a - 1$, independently.


The diagonal of the rectangle from $(0, 0)$ to $(p, a)$ does not pass through any of
these interior lattice points since we assume that the prime $p$ does not divide $a$, so
the fraction $a/p$, which is the slope of the diagonal, is in lowest terms. (If there were
an interior lattice point on the diagonal, the slope of the diagonal would be a fraction
with numerator and denominator smaller than $a$ and $p$.) Since there are no interior
lattice points on the diagonal, exactly half of the lattice points inside the rectangle
lie on each side of the diagonal, so the number of lattice points below the diagonal is
$\frac{1}{2}(p - 1)(a - 1)$. This is an integer since $p$ is odd, which makes $p - 1$ even.
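The count of points below the diagonal can be confirmed by direct enumeration. In this Python sketch (the function name is hypothetical) the condition "below the diagonal" is written as $py < ax$ to avoid fractions.

```python
def points_below_diagonal(p: int, a: int) -> int:
    """Count interior lattice points (x, y) of the p-by-a rectangle with y < a*x/p."""
    return sum(
        1
        for x in range(1, p)
        for y in range(1, a)
        if p * y < a * x
    )

print(points_below_diagonal(7, 5), (7 - 1) * (5 - 1) // 2)
```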




A more refined question one can ask is how many lattice points below the diagonal
have even $x$-coordinate and how many have odd $x$-coordinate. Here there is no
guarantee that these two numbers must be equal, and indeed if they were equal then
both numbers would have to be $\frac{1}{4}(p - 1)(a - 1)$ but this fraction need not be an
integer, for example when $p = 7$ and $a = 4$.


We denote the number of lattice points that are below the diagonal and have even
$x$-coordinate by the letter $e$. Here is a figure showing the values of $e$ when $p = 7$ and
$a$ ranges from 1 to 6:


[Figure: the lattice points counted by $e$ for $p = 7$ and $a = 1, \cdots, 6$.]

  a    e    (a/7)
  6    9    -1
  5    7    -1
  4    6    +1
  3    3    -1
  2    2    +1
  1    0    +1


A slightly more complicated example with $p = 13$ and $a$ varying from 1 to 12 is
shown below.

[Figure: the lattice points counted by $e$ for $p = 13$ and $a = 1, \cdots, 12$.]

  a    e    (a/13)
  12   36   +1
  11   33   -1
  10   30   +1
  9    26   +1
  8    23   -1
  7    21   -1
  6    15   -1
  5    13   -1
  4    10   +1
  3    6    +1
  2    3    -1
  1    0    +1

The way that $e$ varies with $a$ seems somewhat unpredictable. What we will show
is that just knowing the parity of $e$ is already enough to determine the value of the
Legendre symbol via the following simple formula:

$$\left(\frac{a}{p}\right) = (-1)^e$$
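The values of $e$ for $p = 7$ and $p = 13$ can be recomputed by counting lattice points directly from the definition, and the parity of $e$ compared with the Legendre symbol. This is an illustrative Python sketch with hypothetical function names.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def even_x_points(p: int, a: int) -> int:
    """The number e: interior lattice points below the diagonal with even x-coordinate."""
    return sum(
        1
        for x in range(2, p, 2)      # even x-coordinates 2, 4, ..., p - 1
        for y in range(1, a)
        if p * y < a * x             # strictly below the diagonal y = a*x/p
    )

for p in (7, 13):
    for a in range(1, p):
        e = even_x_points(p, a)
        print(p, a, e, legendre(a, p), (-1) ** e)
```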


To prove this we first derive a formula for $e$. The segment of the vertical line $x = u$
between the $x$-axis and the diagonal has length $u \cdot a/p = ua/p$ since the slope of the
diagonal is $a/p$. If $u$ is a positive integer, the number of lattice points on this line
segment is $\lfloor ua/p \rfloor$, the greatest integer $n \le ua/p$. If we add up these numbers of
lattice points for $u$ running through the set of even numbers $E = \{2, 4, \cdots, p - 1\}$
we get:

$$e = \sum_{E} \lfloor ua/p \rfloor$$

The way to compute $\lfloor ua/p \rfloor$ is to apply the division algorithm for integers, dividing
$p$ into $ua$ to obtain $\lfloor ua/p \rfloor$ as the quotient with a remainder that we denote $r(u)$.
Thus we have:

$$ua = p \lfloor ua/p \rfloor + r(u) \qquad (1)$$

The formula $ua = p \lfloor ua/p \rfloor + r(u)$ implies that $\lfloor ua/p \rfloor$ has the same parity as $r(u)$
since $u$ is even and $p$ is odd. Hence $\sum_E \lfloor ua/p \rfloor$ has the same parity as $\sum_E r(u)$. Since
$e = \sum_E \lfloor ua/p \rfloor$, this implies that the number $(-1)^e$ that we are interested in can be
computed as:

$$(-1)^e = (-1)^{\sum_E r(u)} \qquad (2)$$


With this last expression in mind we will focus our attention on the remainders $r(u)$.

The number $r(u)$ lies strictly between 0 and $p$ and can be either even or odd,
but in both cases we can say that $(-1)^{r(u)} r(u)$ is congruent to an even number in
the interval $(0, p)$ since if $r(u)$ is odd, so is $(-1)^{r(u)} r(u)$ and then adding $p$ to this
gives an even number between 0 and $p$. Thus there is always an even number $s(u)$
between 1 and $p$ that is congruent to $(-1)^{r(u)} r(u)$ mod $p$. Obviously $s(u)$ is unique
since no two numbers in the interval $(0, p)$ are congruent mod $p$.

A key fact about these even numbers $s(u)$ is that they are all distinct as $u$ varies
over the set $E$. For suppose we have $s(u) = s(v)$ for another even number $v$ in $E$.
Thus $r(u) \equiv \pm r(v)$ mod $p$, which implies $au \equiv \pm av$ mod $p$ in view of the equa-
tion (1) above. We can cancel the $a$ from both sides of the congruence $au \equiv \pm av$ to
get $u \equiv \pm v$. However we cannot have $u \equiv -v$ because the number between 0 and $p$
that is congruent to $-v$ is $p - v$, so we would have $u = p - v$ which is impossible
since $u$ and $v$ are even while $p$ is odd. Thus we must have $u \equiv +v$, hence $u = v$
since these are numbers strictly between 0 and $p$. This shows that the numbers $s(u)$
are all distinct.

Now consider the product of all the numbers $(-1)^{r(u)} r(u)$ as $u$ ranges over the
set $E$. Written out, this is:

$$\left[(-1)^{r(2)} r(2)\right]\left[(-1)^{r(4)} r(4)\right] \cdots \left[(-1)^{r(p-1)} r(p-1)\right] \qquad (3)$$

By equation (1) we have $r(u) \equiv ua$ mod $p$, so this product is congruent mod $p$ to:

$$\left[(-1)^{r(2)} 2a\right]\left[(-1)^{r(4)} 4a\right] \cdots \left[(-1)^{r(p-1)} (p-1)a\right]$$


On the other hand, by the definition of the numbers $s(u)$ the product (3) is congruent
mod $p$ to $[s(2)][s(4)] \cdots [s(p-1)]$. There are $\frac{1}{2}(p-1)$ factors here and they are all
distinct even numbers in the interval $(0, p)$ as we showed in the previous paragraph,
so they are just a rearrangement of the numbers $2, 4, \cdots, p - 1$. Thus we have the
following congruence:

$$\left[(-1)^{r(2)} 2a\right]\left[(-1)^{r(4)} 4a\right] \cdots \left[(-1)^{r(p-1)} (p-1)a\right] \equiv (2)(4) \cdots (p-1) \mod p$$

Canceling the factors $2, 4, \cdots, p - 1$ from both sides of this congruence gives:

$$(-1)^{\sum_E r(u)} a^{\frac{p-1}{2}} \equiv 1 \mod p$$

Both the factors $(-1)^{\sum_E r(u)}$ and $a^{\frac{p-1}{2}}$ are $\pm 1$ mod $p$ and their product is 1 so they
must be equal mod $p$ (using the fact that 1 and $-1$ are not congruent modulo an
odd prime). By Euler's formula we have $a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right)$ mod $p$, so from the earlier
formula (2) we conclude that $\left(\frac{a}{p}\right) = (-1)^e$. This finishes Step 2.
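The two key facts in this step, that the numbers $s(u)$ are a rearrangement of $2, 4, \cdots, p-1$ and that $e$ has the same parity as $\sum_E r(u)$, can be checked for small primes. A minimal Python sketch with hypothetical names:

```python
def remainders(p: int, a: int) -> tuple[dict[int, int], dict[int, int]]:
    """Return the maps u -> r(u) and u -> s(u) for u in E = {2, 4, ..., p-1}."""
    evens = range(2, p, 2)
    r = {u: (u * a) % p for u in evens}
    # s(u) is the even number in (0, p) congruent to (-1)^r(u) * r(u):
    # r(u) itself when r(u) is even, and p - r(u) when r(u) is odd.
    s = {u: r[u] if r[u] % 2 == 0 else p - r[u] for u in evens}
    return r, s

def e_from_floors(p: int, a: int) -> int:
    """e as the sum of floor(u*a/p) over even u."""
    return sum((u * a) // p for u in range(2, p, 2))

r, s = remainders(7, 5)
print(r, s, e_from_floors(7, 5), sum(r.values()))
```

For $p = 7$, $a = 5$ the remainders are $3, 6, 2$ and the numbers $s(u)$ are $4, 6, 2$.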


Step 3. Now we specialize the value of $a$ to be an odd prime $q$ distinct from $p$. As
in Step 2 we consider lattice points in the interior of a $p \times q$ rectangle.

[Figure: lattice points in the interior of a $p \times q$ rectangle with its diagonal.]

From Step 2 we know that $\left(\frac{q}{p}\right) = (-1)^e$ where $e$ is the number of lattice points
with even $x$-coordinate inside the rectangle and below the diagonal. Suppose that
we divide the rectangle into two equal halves separated by the vertical line $x = p/2$
which does not pass through any lattice points since $p$ is odd. This vertical line cuts
off two smaller triangles from the two large triangles above and below the diagonal of
the rectangle. Call the lower small triangle $L$ and the upper one $U$, and let $l$ and $u$
denote the number of lattice points with even $x$-coordinate in the interiors of $L$ and
$U$ respectively. Note that $u$ has the same parity as the number of lattice points with
even $x$-coordinate in the interior of the quadrilateral below $U$ in the right half of the
rectangle since each column of lattice points inside the rectangle has $q - 1$ points, an
even number. Thus $e$ has the same parity as $l + u$, hence $(-1)^e = (-1)^{l+u}$.

The next thing to notice is that rotating the triangle $U$ by 180 degrees about the
center of the rectangle carries it onto the triangle $L$. This rotation takes the lattice
points inside $U$ with even $x$-coordinate onto the lattice points inside $L$ with odd $x$-
coordinate. Thus we obtain the formula $\left(\frac{q}{p}\right) = (-1)^t$ where $t$ is the total number of
lattice points inside the triangle $L$.

Reversing the roles of $p$ and $q$, we can also say that $\left(\frac{p}{q}\right) = (-1)^{t'}$ where $t'$ is
the number of lattice points inside the triangle $L'$ with edges on the diagonal of the
rectangle, the horizontal line $y = q/2$, and the $y$-axis. Then $t + t'$ is the number of
lattice points in the interior of the small rectangle formed by $L$ and $L'$ together. This
number is just $\frac{p-1}{2} \cdot \frac{q-1}{2}$. Thus we have

$$\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^t (-1)^{t'} = (-1)^{t + t'} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}$$

which finally finishes the proof of quadratic reciprocity. □
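The counts $t$ and $t'$ in this last step can be computed directly. In the Python sketch below (hypothetical names), the triangle $L'$ is handled by the same function with the roles of $p$ and $q$ exchanged, which is an assumption of the sketch justified by the symmetry of the picture.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares mod p."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def triangle_count(p: int, q: int) -> int:
    """Lattice points with 0 < x < p/2 lying strictly below the diagonal y = q*x/p."""
    return sum(
        1
        for x in range(1, (p - 1) // 2 + 1)
        for y in range(1, q)
        if p * y < q * x
    )

p, q = 7, 5
t = triangle_count(p, q)        # points inside L
t_prime = triangle_count(q, p)  # points inside L', by exchanging the roles of p and q
print(t, t_prime, t + t_prime, ((p - 1) // 2) * ((q - 1) // 2))
print(legendre(q, p), (-1) ** t, legendre(p, q), (-1) ** t_prime)
```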


We can also use the geometric interpretation of $\left(\frac{a}{p}\right)$ to prove the formula for $\left(\frac{2}{p}\right)$
that was given in Section 6.2, namely:

$$\left(\frac{2}{p}\right) = \begin{cases} +1 & \text{if } p = 8k \pm 1 \\ -1 & \text{if } p = 8k \pm 3 \end{cases}$$

We have shown that $\left(\frac{2}{p}\right) = (-1)^e$ where $e$ is the number of lattice points inside a
$p \times 2$ rectangle lying below the diagonal and having even $x$-coordinate, as indicated
in the following figure which shows the diagonals for $p = 3, 5, 7, \cdots, 17$:


Another way to describe $e$ is to say that it is equal to the number of even integers
in the interval from $p/2$ to $p$. We do not need to assume that $p$ is prime in order
to count these points below the diagonals, just that $p$ is odd. One can see what the
pattern is just by looking at the figure: Each time $p$ increases by 2 there is one more
even number at the right end of the interval $(p/2, p)$, and there may or may not be
one fewer even number at the left end of the interval, depending on whether $p$ is
increasing from $4k - 1$ to $4k + 1$ or from $4k + 1$ to $4k + 3$. It follows that the parity
of $e$ depends only on the value of $p$ mod 8 as in the table for $p \le 17$, so $e$ is even
for $p \equiv \pm 1$ mod 8 and $e$ is odd for $p \equiv \pm 3$ mod 8.
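The pattern mod 8 is easy to confirm by counting even integers in the interval $(p/2, p)$ for odd $p$, prime or not. A short Python sketch with a hypothetical function name:

```python
def even_integers_in_upper_half(p: int) -> int:
    """Number of even integers n with p/2 < n < p, for odd p."""
    return sum(1 for n in range(2, p, 2) if 2 * n > p)

for p in range(3, 18, 2):
    e = even_integers_in_upper_half(p)
    print(p, p % 8, e, "even" if e % 2 == 0 else "odd")
```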


Exercises 


1. As a sort of converse to Wilson's Theorem, show that if $n$ is not a prime then
$(n - 1)!$ is not congruent to $-1$ mod $n$. More precisely, when $n > 4$ and $n$ is not
prime, show that $n$ divides $(n - 1)!$, so $(n - 1)! \equiv 0$ mod $n$. What happens when
$n = 4$?

2. In Step 2 of the proof of quadratic reciprocity there were figures depicting the 


geometric interpretation of (4) and (4). Draw analogous figures for (4) and (4). 


3. Show that the calculation of the Legendre symbol $\left(\frac{-1}{p}\right)$ can also be obtained using
the method in the proof of quadratic reciprocity involving counting certain lattice
points in a $(p - 1) \times p$ rectangle.


7 The Class Group for Quadratic Forms


In the previous chapter we obtained an answer to the question of which numbers 
n are represented by at least one form of a given discriminant, where by “represent” 
we mean “appear in the topograph”, so we consider only the values Q(x, y) for prim- 
itive pairs (x,y). The answer was in terms of certain congruence conditions on the 
prime divisors of n. We could also determine the genus of the forms representing 
n via congruence conditions. What one would really like to do is refine these results 
to determine which equivalence classes of forms represent n, and for this it is natu- 
ral to consider only primitive forms. Determining which primes each primitive form 
represents is a difficult and subtle problem about which much is known, but it re- 
quires considerably deeper mathematics than we can cover in this book so we will say 
nothing more about this beyond what we have already discussed concerning genus. 
Instead, what we will do in the present chapter is study the question for nonprimes, 
assuming one already knows which primes each form represents. For fundamental 
discriminants we will obtain a fairly complete picture, while for nonfundamental dis- 
criminants there will remain certain ambiguities, with examples showing the extra 
complication in these cases. 

The main tool will be a method for multiplying forms of a given discriminant 
that corresponds to multiplying the numbers represented by these forms. This mul- 
tiplication of forms gives rise to a commutative group structure on the set of proper 
equivalence classes of primitive forms of a given discriminant. This group, called 
the class group and denoted CG(Δ) for discriminant Δ, also has other uses besides
determining the forms representing nonprimes. For example we will use it to give 
a good explanation for why the number of genera in a given discriminant is equal 
to the number of equivalence classes of primitive forms in that discriminant whose 
topographs have mirror symmetry. 

In this chapter we will restrict attention entirely to forms of nonsquare discrimi- 
nant, which means elliptic and hyperbolic forms. For elliptic forms we only consider 
those with positive values, as usual. 
7.1 Multiplication of Forms


Since we will often be dealing with several different forms at a time it will be
convenient to shorten the notation by writing a form ax² + bxy + cy² simply as
[a, b, c], retaining only the essential information of the coefficients. We are restricting
attention to discriminants that are not squares so the outer coefficients a and c must
always be nonzero.

Recall that a number a is represented by a form Q if and only if a appears
in the topograph of Q, and this in turn is equivalent to a appearing as the leading
coefficient of a form [a, b, c] equivalent to Q. A simple observation is that if a factors
as a = a₁a₂ then the forms [a₁a₂, b, c], [a₁, b, a₂c], and [a₂, b, a₁c] all have the
same discriminant. This shows that if a number a is represented in discriminant Δ
then so is each divisor of a, as we saw in Proposition 6.1.

A form [a₁a₂, b, c] can thus be split into two forms [a₁, b, a₂c] and [a₂, b, a₁c]
of the same discriminant. One might wonder about the reverse process of combining
or “multiplying” the two forms [a₁, b, a₂c] and [a₂, b, a₁c] to obtain the form
[a₁a₂, b, c]. For example the product of [2,0,15] and [3,0,10] would be [6,0,5].
The main goal in this section will be to show that this simple way to multiply certain
special pairs of forms is nevertheless sufficiently general to give a well-defined
multiplication operation on the set of proper equivalence classes of primitive forms of a
given discriminant.

A pair of forms [a₁, b, a₂c] and [a₂, b, a₁c] is said to be concordant. For two
forms to be concordant is obviously a very strong condition since not only are the
second coefficients of the two forms equal, but also the first coefficient of each form
divides the third coefficient of the other form. Furthermore, the discriminants of the
two forms are equal. Conversely, suppose that two forms [a₁, b, c₁] and [a₂, b, c₂]
with the same middle coefficient have the same discriminant. Then a₁c₁ = a₂c₂,
so if a₁ divides c₂, say c₂ = a₁c for some integer c, then a₁c₁ = a₂c₂ = a₂a₁c
so in particular a₁c₁ = a₁a₂c, and since a₁ is nonzero we can cancel it from this
equation to get c₁ = a₂c. The two forms are thus [a₁, b, a₂c] and [a₂, b, a₁c] so
they are concordant. This argument shows in fact that for two forms [a₁, b, c₁] and
[a₂, b, c₂] of the same discriminant, if a₁ divides c₂ then it automatically follows that
a₂ divides c₁.
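
The concordance condition and the product rule are simple enough to check mechanically. The following Python sketch is only an illustration: forms are stored as tuples (a, b, c), and the function names are our own choices rather than anything defined in the text. It tests whether two forms are concordant and, if so, returns the product [a₁a₂, b, c].

```python
def discriminant(form):
    """Discriminant b^2 - 4ac of the form [a, b, c]."""
    a, b, c = form
    return b * b - 4 * a * c


def concordant_product(f1, f2):
    """Return [a1*a2, b, c] if f1 = [a1, b, a2*c] and f2 = [a2, b, a1*c]
    are concordant, and None otherwise (hypothetical helper)."""
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    # Concordant forms share the middle coefficient and the discriminant.
    if b1 != b2 or discriminant(f1) != discriminant(f2):
        return None
    # a1 must divide the third coefficient of the other form.
    if a1 == 0 or c2 % a1 != 0:
        return None
    c = c2 // a1
    # As argued in the text, c1 = a2*c then follows automatically.
    assert c1 == a2 * c
    return (a1 * a2, b1, c)


print(concordant_product((2, 0, 15), (3, 0, 10)))
print(concordant_product((2, 0, 3), (2, 0, 3)))
```

The first call reproduces the example [2,0,15][3,0,10] = [6,0,5], while the second returns None since 2 does not divide 3.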

Since we wish to consider only primitive forms the following result will be useful: 
Lemma 7.1. If the concordant forms [a₁, b, a₂c] and [a₂, b, a₁c] are primitive then
so is their product [a₁a₂, b, c]. If a₁ and a₂ are coprime then the converse is also
true: If [a₁a₂, b, c] is primitive then so are [a₁, b, a₂c] and [a₂, b, a₁c].
An extra condition is needed in the converse since for example the primitive form
[4,0,1] factors as the product of the nonprimitive concordant forms [2,0,2] and
[2,0,2].

Proof: If the coefficients of [a₁a₂, b, c] have a common divisor greater than 1 then they
have a common prime divisor, which will divide either a₁ or a₂, as well as b and c, so one of
the forms [a₁, b, a₂c] and [a₂, b, a₁c] will not be primitive. This gives the first
statement. For the second, if one of [a₁, b, a₂c] and [a₂, b, a₁c] is not primitive, say
[a₁, b, a₂c], then its coefficients will be divisible by some prime p. If a₁ and a₂ are
coprime, then p dividing a₁ and a₂c implies that p divides c. Thus p divides all
three coefficients of [a₁a₂, b, c], making it nonprimitive. □


Proposition 7.2. For each pair of primitive forms Q₁ and Q₂ of discriminant Δ
there is a pair of primitive forms Q₁′ = [a₁, b, a₂c] and Q₂′ = [a₂, b, a₁c] which are
concordant to each other and properly equivalent to Q₁ and Q₂ respectively. The
forms Q₁′ and Q₂′ can be chosen so that a₁ > 0 and a₂ > 0.


For the proof we will need the following result which will be useful on other 
occasions as well: 


Lemma 7.3. For each pair of forms Q₁ = [a₁, b₁, c₁] and Q₂ = [a₂, b₂, c₂] of the
same discriminant with a₁ and a₂ coprime there exists a pair of forms [a₁, b, a₂c]
and [a₂, b, a₁c] that are concordant to each other and properly equivalent to Q₁
and Q₂ respectively.


Proof: The main step will be to find two forms properly equivalent to Q₁ and Q₂ that
have the same first coefficients as Q₁ and Q₂ and have equal second coefficients. To
do this we begin by recalling that the edges in the topograph of a form have integer
labels, with the sign of a label changing when the orientation of the edge is reversed.
For a region in the topograph of Q₁ labeled a₁ let us orient the edges bordering this
region all in the same direction so that the region lies to the left as we move along
the edges in the direction specified by their orientation. The edge labels then form
an arithmetic progression with increment 2a₁. One of these edges is labeled b₁, so
the other edge labels are b₁ + 2a₁m for m varying over all integers. Similarly, in
the topograph of Q₂ we have a region labeled a₂ whose bordering edges have labels
b₂ + 2a₂n for all integers n.

We would like one of the edge labels b₁ + 2a₁m to equal one of the edge labels
b₂ + 2a₂n. This means we would like to find integers m and n satisfying the equation
b₁ + 2a₁m = b₂ + 2a₂n, or equivalently a₁m − a₂n = (b₂ − b₁)/2. Note that the right
side of this equation is an integer since the edge labels in a topograph always have
the same parity as the discriminant, which is the same for both forms by assumption.
From Section 2.3 we know the equation a₁m − a₂n = (b₂ − b₁)/2 always has an integer
solution (m, n) if a₁ and a₂ are coprime. Thus we can find edges bordering the a₁
and a₂ regions with the same label b. The two given forms are therefore equivalent
to forms [a₁, b, c₁′] and [a₂, b, c₂′], and in fact properly equivalent because of the way
we have oriented the edges bordering the a₁ and a₂ regions.

Equating the discriminants of these two forms [a₁, b, c₁′] and [a₂, b, c₂′] leads
to the equation a₁c₁′ = a₂c₂′. Since a₁ and a₂ are coprime this implies that a₁
divides c₂′, so c₂′ = a₁c for some integer c. The equation a₁c₁′ = a₂c₂′ then becomes
a₁c₁′ = a₂a₁c, which implies that c₁′ = a₂c since a₁ is nonzero. Thus we have two
concordant forms [a₁, b, a₂c] and [a₂, b, a₁c] properly equivalent to the original
forms [a₁, b₁, c₁] and [a₂, b₂, c₂]. □
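
The proof of Lemma 7.3 is effectively an algorithm, and it may help to see it written out. The Python sketch below is one possible rendering under our own naming (forms as tuples (a, b, c), a hypothetical function concordant_pair). Instead of quoting a solution of a₁m − a₂n = (b₂ − b₁)/2 from Section 2.3, it simply searches the progression b₁ + 2a₁m for a label that also lies in b₂ + 2a₂n, which must succeed for some m between 0 and |a₂| − 1 when a₁ and a₂ are coprime.

```python
from math import gcd


def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c


def concordant_pair(f1, f2):
    """Given [a1, b1, c1] and [a2, b2, c2] of equal discriminant with a1, a2
    coprime, return concordant forms [a1, b, a2*c] and [a2, b, a1*c] having
    the same leading coefficients and a common middle coefficient b."""
    a1, b1, _ = f1
    a2, b2, _ = f2
    D = discriminant(f1)
    if D != discriminant(f2) or gcd(a1, a2) != 1:
        raise ValueError("need equal discriminants and coprime leading coefficients")
    # Edge labels around the a1 region are b1 + 2*a1*m.  Look for one that
    # is also of the form b2 + 2*a2*n.
    for m in range(abs(a2)):
        b = b1 + 2 * a1 * m
        if (b - b2) % (2 * a2) == 0:
            break
    # b^2 - D = 4*a1*c1' = 4*a2*c2' with a1, a2 coprime, so 4*a1*a2 divides it.
    c = (b * b - D) // (4 * a1 * a2)
    return (a1, b, a2 * c), (a2, b, a1 * c)


print(concordant_pair((2, 0, 13), (3, 2, 9)))
```

For the forms [2,0,13] and [3,2,9] of discriminant −104 the search stops at b = 8 and returns the pair [2,8,21], [3,8,14].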


Proof of Proposition 7.2: Choose a number a₁ > 0 in the topograph of Q₁. By
Proposition 6.25 the topograph of Q₂ contains some number a₂ > 0 coprime to a₁.
Thus Q₁ and Q₂ are properly equivalent to forms [a₁, b₁, c₁] and [a₂, b₂, c₂], and
then Lemma 7.3 finishes the proof. □


To illustrate how to multiply forms let us look at a few examples in the case
Δ = −104. Here there are four equivalence classes of forms:

[Figure: topographs of Q₁ = [1,0,26], Q₂ = [2,0,13], Q₃ = [3,2,9], and Q₄ = [5,4,6]]

Since only the first two forms have mirror symmetry, the class number is 6. We will
be somewhat free with the notation and use the same symbol Qᵢ to denote any form
properly equivalent to the original form Qᵢ.

Let us compute the product of Q₂ and Q₃ using the method in the proof of
Lemma 7.3. To begin we need regions in the topographs of Q₂ and Q₃ with coprime
labels, so the simplest thing is to use the region labeled 2 in the topograph of Q₂
and the region labeled 3 in the topograph of Q₃. For the Q₂ topograph the edge
between the 2 and 13 regions is labeled 0 so the next edges bordering the 2 region
are labeled 4, 8, 12, ···. For the 3 region in the topograph of Q₃ the bordering edges
are labeled 2, 8, 14, ··· starting with the edge adjacent to the 9 region. The number 8
is in both these arithmetic progressions so we choose this for b. In the Q₂ topograph
this edge labeled 8 is between the regions labeled 2 and 21 so the form we want is
[2,8,21]. For Q₃ the edge labeled 8 is between the 3 and 14 regions so the form
corresponding to this edge is [3,8,14]. The product of these two concordant forms
is then [6,8,7]. The values of this form at (x,y) = (1,0), (0,1), and (1,1) are 6, 7,
and 21 so from the topograph of Q₄ we see that this form is properly equivalent to
Q₄. Thus we have Q₂Q₃ = Q₄.

The product Q₄Q₄, or in other words Q₄², can be computed in the same way using
the regions in the topograph of Q₄ with the coprime labels 5 and 6. For the edges
bordering the 5 region the labels starting with the edge between the 5 and 6 regions
are 4, 14, 24, ···. For the edges bordering the 6 region we can start with the same
edge but now this edge must be oriented in the opposite direction in order to have
the 6 region on our left as we move forward. The edge labels are then −4, 8, 20, ···.
Continuing these arithmetic progressions a little farther we find the common label 44
on the edge between the 5 and 102 regions, and on the edge between the 6 and 85
regions. Thus we have the concordant forms [5,44,102] and [6,44,85], with product
[30,44,17]. The coefficients 30 and 17 appear in adjacent regions in the topograph
of Q₃ so Q₄² is properly equivalent to either Q₃ or the mirror image form. We can
determine which by evaluating [30,44,17] at (x,y) = (−1,1), giving the value 3.
Thus in the topograph of [30,44,17] the values 30, 17, 3 appear in clockwise order
around a vertex, while in the topograph of Q₃ they are in counterclockwise order, so
these two topographs are mirror images and hence Q₄² is properly equivalent to the
mirror image form of Q₃.
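
Both identifications above were read off from topographs. As a cross-check one can instead reduce the product forms. The sketch below assumes the standard reduction procedure for positive definite forms, which uses only proper equivalences and ends at the unique form with −a < b ≤ a ≤ c (and b ≥ 0 when a = c). This procedure is swapped in here for convenience and is not the method used in the text; the function name is ours.

```python
def reduce_form(form):
    """Reduce a positive definite form [a, b, c] by proper equivalences."""
    a, b, c = form
    while True:
        # Substituting x -> x + k*y changes b to b + 2ak and c to ak^2 + bk + c.
        # Choose k so that the new b lies in the interval (-a, a].
        k = (a - b) // (2 * a)
        b, c = b + 2 * a * k, a * k * k + b * k + c
        if a > c or (a == c and b < 0):
            # (x, y) -> (-y, x) has determinant 1 and sends [a, b, c] to [c, -b, a].
            a, b, c = c, -b, a
        else:
            return (a, b, c)


print(reduce_form((6, 8, 7)))      # product of [2,8,21] and [3,8,14]
print(reduce_form((30, 44, 17)))   # product of [5,44,102] and [6,44,85]
```

The first product reduces to [5,4,6], which is Q₄, and the second reduces to [3,−2,9], the mirror image of Q₃ = [3,2,9], in agreement with the two calculations above.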

In these examples there were a number of choices made in order to compute
the products QᵢQⱼ. Thus for computing Q₂Q₃ we first chose the regions labeled 2
and 3 in the topographs of Q₂ and Q₃, but we could have chosen any region in one
topograph and then chosen any of the infinitely many regions in the other topograph
with a label coprime to the label of the first region. After choosing the regions labeled
2 and 3 we then chose edges bordering these regions having the same label b, and
there are infinitely many possibilities to choose from here too. For the 2 region the
edge labels are the integers 8 + 4k and for the 3 region they are the integers 8 + 6k
so the common edge labels are the integers 8 + 12k, which are in fact all the edge
labels for the 6 region in the topograph of Q₄. It is not at all obvious that the various
choices that were made for the two topographs always lead to the same result that
Q₂Q₃ = Q₄. Our next task will be to prove that this is true not just for this calculation
but in general.
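
For the second kind of choice, the choice of the common label b, the independence can at least be seen directly in this example. The short sketch below (our own illustration, with a hypothetical helper shift) checks that every product form [6, 8 + 12k, c] arises from [6,8,7] by the change of variables x → x + ky, which has determinant 1, so all these products are properly equivalent.

```python
def shift(form, k):
    """Apply x -> x + k*y to [a, b, c], giving [a, b + 2ak, ak^2 + bk + c]."""
    a, b, c = form
    return (a, b + 2 * a * k, a * k * k + b * k + c)


for k in range(-3, 4):
    b = 8 + 12 * k
    c = (b * b + 104) // 24          # third coefficient forced by discriminant -104
    assert shift((6, 8, 7), k) == (6, b, c)
    print(k, (6, b, c))
```

This says nothing yet about the first kind of choice, the choice of regions with coprime labels, which is where the real work lies.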

What we wish to prove is the crucial fact that multiplication of proper equivalence 
classes of primitive forms by choosing a concordant pair of forms in these classes does 
not depend on which concordant pair we choose. This can be phrased in the following 
way: 
Proposition 7.4. For a fixed discriminant let Q₁, Q₂ be a pair of concordant primitive
forms and let Q₁′, Q₂′ be another such pair properly equivalent to Q₁ and Q₂
respectively. Then the products Q₁Q₂ and Q₁′Q₂′ are properly equivalent.


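The proof will involve a certain amount of calculation, and to ease the burden
it will be convenient to express quadratic forms in terms of matrices. This is based
on the simple observation that a form ax² + bxy + cy², regarded as a 1 × 1 matrix
(ax² + bxy + cy²), can be obtained as a product of a 1 × 2 matrix, a 2 × 2 matrix,
and a 2 × 1 matrix:

\[
\begin{pmatrix} x & y \end{pmatrix}
\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} ax + \bar b y & \bar b x + cy \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} ax^2 + bxy + cy^2 \end{pmatrix}
\]

Thus we are expressing the form ax² + bxy + cy² as a matrix
$\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}$ where b̄ = b/2.
The entries b̄ might not be integers, but this will not matter for our purposes.

When we do a change of variables by means of a matrix
$\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ with determinant ps − qr = 1, replacing
$\begin{pmatrix} x \\ y \end{pmatrix}$ by
$\begin{pmatrix} p & q \\ r & s \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} px + qy \\ rx + sy \end{pmatrix}$,
then the product
$\begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$
becomes
$\begin{pmatrix} x & y \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix}\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}\begin{pmatrix} p & q \\ r & s \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$,
with the second matrix being the transpose of the fourth matrix. Thus the matrix
$\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}$ for the form ax² + bxy + cy² is replaced
by the matrix
$\begin{pmatrix} a' & \bar b' \\ \bar b' & c' \end{pmatrix} = \begin{pmatrix} p & r \\ q & s \end{pmatrix}\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}\begin{pmatrix} p & q \\ r & s \end{pmatrix}$
for the new form a′x² + b′xy + c′y². We can write this last equation as

\[
\begin{pmatrix} p & r \\ q & s \end{pmatrix}
\begin{pmatrix} a & \bar b \\ \bar b & c \end{pmatrix}
= \begin{pmatrix} a' & \bar b' \\ \bar b' & c' \end{pmatrix}
\begin{pmatrix} p & q \\ r & s \end{pmatrix}^{-1}
= \begin{pmatrix} a' & \bar b' \\ \bar b' & c' \end{pmatrix}
\begin{pmatrix} s & -q \\ -r & p \end{pmatrix}
\]

where this last matrix is the inverse of $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ since ps − qr = 1.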
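
The matrix description of a change of variables is easy to test numerically. The following Python sketch is an illustration only: the helper names are hypothetical, and exact fractions are used because b̄ = b/2 need not be an integer. It forms the product of the transpose, the matrix of the form, and the change-of-variables matrix, and reads off the new coefficients.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def change_variables(form, p, q, r, s):
    """Coefficients of Q(px + qy, rx + sy) for Q = [a, b, c], computed as
    (transpose of M) * (matrix of Q) * M with M = (p q; r s)."""
    a, b, c = form
    half_b = Fraction(b, 2)
    A = [[Fraction(a), half_b], [half_b, Fraction(c)]]
    M = [[p, q], [r, s]]
    Mt = [[p, r], [q, s]]
    N = matmul(matmul(Mt, A), M)
    # The off-diagonal entry is half of the new middle coefficient.
    return (N[0][0], 2 * N[0][1], N[1][1])


print(change_variables((2, 0, 13), 1, 2, 0, 1))
print(change_variables((3, 2, 9), 2, 1, 1, 1))
```

With the matrix having rows (1, 2) and (0, 1) the form [2,0,13] becomes [2,8,21], the form that appeared in the computation of Q₂Q₃ for Δ = −104.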
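Proof of Proposition 7.4: We will use the notation Q ~ Q′ to mean that the forms Q
and Q′ are properly equivalent.

Let Q₁ = [a₁, b, a₂c] and Q₂ = [a₂, b, a₁c], with Q₁′ = [a₁′, b′, a₂′c′] and
Q₂′ = [a₂′, b′, a₁′c′]. To begin the proof we look at the special case that Q₁ = Q₁′, so
a₁ = a₁′, b = b′, and a₂c = a₂′c′. We assume Q₂ ~ Q₂′ so by the remarks preceding the proof
there is an integer matrix $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ of determinant 1 such that:

\[
\begin{pmatrix} p & r \\ q & s \end{pmatrix}
\begin{pmatrix} a_2 & \bar b \\ \bar b & a_1 c \end{pmatrix}
= \begin{pmatrix} a_2' & \bar b \\ \bar b & a_1 c' \end{pmatrix}
\begin{pmatrix} s & -q \\ -r & p \end{pmatrix}
\]

Multiplied out, this becomes:

\[
\begin{pmatrix} a_2 p + \bar b r & \bar b p + a_1 c r \\ a_2 q + \bar b s & \bar b q + a_1 c s \end{pmatrix}
= \begin{pmatrix} a_2' s - \bar b r & \bar b p - a_2' q \\ \bar b s - a_1 c' r & a_1 c' p - \bar b q \end{pmatrix}
\tag{$*$}
\]

To show Q₁Q₂ ~ Q₁Q₂′ we would like to find an integer matrix
$\begin{pmatrix} p' & q' \\ r' & s' \end{pmatrix}$ of determinant 1 such that

\[
\begin{pmatrix} p' & r' \\ q' & s' \end{pmatrix}
\begin{pmatrix} a_1 a_2 & \bar b \\ \bar b & c \end{pmatrix}
= \begin{pmatrix} a_1 a_2' & \bar b \\ \bar b & c' \end{pmatrix}
\begin{pmatrix} s' & -q' \\ -r' & p' \end{pmatrix}
\]

that is,

\[
\begin{pmatrix} a_1 a_2 p' + \bar b r' & \bar b p' + c r' \\ a_1 a_2 q' + \bar b s' & \bar b q' + c s' \end{pmatrix}
= \begin{pmatrix} a_1 a_2' s' - \bar b r' & \bar b p' - a_1 a_2' q' \\ \bar b s' - c' r' & c' p' - \bar b q' \end{pmatrix}
\tag{$**$}
\]

We can convert the upper left entries in the two matrices in (∗) to the corresponding
entries in (∗∗) by multiplying by a₁ if we choose p′ = p, s′ = s, and r′ = a₁r. Then
the equality of the upper left entries in (∗) will imply equality of the upper left entries
in (∗∗). If we further choose q′ = q/a₁ then the upper right entries in (∗∗) will be
equal to the corresponding entries in (∗), and the same will be true for the lower
left entries. The lower right entries in (∗) will be a₁ times those in (∗∗) so the lower
right entries in (∗∗) will be equal as well. Thus we hope to define
$\begin{pmatrix} p' & q' \\ r' & s' \end{pmatrix}$ by:

\[
\begin{pmatrix} p' & q' \\ r' & s' \end{pmatrix}
= \begin{pmatrix} p & q/a_1 \\ a_1 r & s \end{pmatrix}
\]

Note that this matrix has the same determinant as $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$. The only problem is that
the entry q′ = q/a₁ will only be an integer if a₁ divides q. To guarantee that it does,
observe that the equality of the upper right entries in (∗) implies that a₁cr = −a₂′q,
so if a₁ is coprime to a₂′ then a₁ will divide q. Thus we have proved the proposition
in the special case Q₁ = Q₁′ provided that a₁ and a₂′ are coprime.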

In the case just considered we assumed Q₁ = Q₁′ which implied that b = b′. Now
let us assume merely that b = b′ along with the previous hypothesis that a₁ and a₂′
are coprime. Under these conditions the desired equivalence Q₁Q₂ ~ Q₁′Q₂′ will be
obtained as the combination of two equivalences Q₁Q₂ ~ Q₁Q₂′ ~ Q₁′Q₂′, but first we
have to check that Q₁ and Q₂′ are concordant so that Q₁Q₂′ is defined. Since b = b′
and the discriminants of Q₁ and Q₂′ are equal, we have a₁a₂c = a₁′a₂′c′. Since a₁
and a₂′ are coprime it follows that a₁ divides a₁′c′. As we saw earlier, this implies
that the forms Q₁ = [a₁, b, a₂c] and Q₂′ = [a₂′, b, a₁′c′] are concordant.

Assuming that a₁ and a₂′ are coprime, the previous case Q₁ = Q₁′ now gives
an equivalence Q₁Q₂ ~ Q₁Q₂′. Switching the roles of Q₁ and Q₂′ as well as Q₁′ and
Q₂, this argument also shows Q₁Q₂′ ~ Q₁′Q₂′ using the same assumption that a₁ and
a₂′ are coprime. We conclude that Q₁Q₂ ~ Q₁′Q₂′ when a₁ and a₂′ are coprime and
b = b′.

Next we consider how to arrange that b = b′. The hypothesis that will allow this
is that a₁a₂ is coprime to a₁′a₂′, which is equivalent to saying that each of a₁ and
a₂ is coprime to each of a₁′ and a₂′. If a₁a₂ and a₁′a₂′ are coprime, we know by
an argument in the proof of Lemma 7.3 that the arithmetic progressions b + 2a₁a₂m
and b′ + 2a₁′a₂′n have a common value B. This will also be a value in each of the
arithmetic progressions b + 2a₁m, b + 2a₂n, b′ + 2a₁′m, and b′ + 2a₂′n. Thus we have
forms Q̃ᵢ = [aᵢ, B, c̃ᵢ] ~ Qᵢ for i = 1, 2, and similarly Q̃ᵢ′ = [aᵢ′, B, c̃ᵢ′] ~ Qᵢ′.

Let us check that Q̃₁ and Q̃₂ are concordant. This will be true if the first
coefficient of one form divides the third coefficient of the other, say a₂ divides c̃₁. The
forms Q₁ and Q̃₁ have the same discriminant so b² − 4a₁a₂c = B² − 4a₁c̃₁. Substituting
B = b + 2a₁a₂m and simplifying, we get −a₁a₂c = a₁a₂bm + a₁²a₂²m² − a₁c̃₁. After
canceling a factor of a₁ from both sides, this becomes −a₂c = a₂bm + a₁a₂²m² − c̃₁
which implies that a₂ divides c̃₁. Thus Q̃₁ and Q̃₂ are concordant, and by the same
reasoning Q̃₁′ and Q̃₂′ are concordant, so we can form the products Q̃₁Q̃₂ and Q̃₁′Q̃₂′.
We have Q₁Q₂ ~ Q̃₁Q̃₂ since the label B occurs on an edge bordering the region
labeled a₁a₂ in the topographs of both of these product forms, which is obvious for
Q̃₁Q̃₂ = [a₁a₂, B, −] while for Q₁Q₂ = [a₁a₂, b, −] it follows from the definition of B.
Similarly Q₁′Q₂′ ~ Q̃₁′Q̃₂′. We can now apply the previous case b = b′ to the four forms
Q̃₁, Q̃₂, Q̃₁′, Q̃₂′ since the leading coefficients a₁ and a₂′ of the first and fourth forms
are coprime. Thus we have Q̃₁Q̃₂ ~ Q̃₁′Q̃₂′ and hence Q₁Q₂ ~ Q̃₁Q̃₂ ~ Q̃₁′Q̃₂′ ~ Q₁′Q₂′.
This proves the proposition under the assumption that a₁a₂ is coprime to a₁′a₂′.

Now we can finish the proof by reducing to the case just considered, that a₁a₂
is coprime to a₁′a₂′. Choose a number A₁ represented by Q₁ coprime to a₁a₂a₁′a₂′,
and then choose a number A₂ represented by Q₂ and coprime to A₁a₁a₂a₁′a₂′. Since
A₁ and A₂ are coprime, Lemma 7.3 implies that there exist concordant forms
Q̂₁ = [A₁, B, A₂C] and Q̂₂ = [A₂, B, A₁C] with Q̂₁ ~ Q₁ and Q̂₂ ~ Q₂. Since A₁A₂ is
coprime to a₁a₂ the previous case implies that Q₁Q₂ ~ Q̂₁Q̂₂. The previous case also
implies that Q̂₁Q̂₂ ~ Q₁′Q₂′ since A₁A₂ is coprime to a₁′a₂′ and we have Q̂₁ ~ Q₁ ~ Q₁′
and Q̂₂ ~ Q₂ ~ Q₂′. Thus Q₁Q₂ ~ Q̂₁Q̂₂ ~ Q₁′Q₂′ and we are done. □


For proper equivalence classes of primitive forms of a fixed discriminant we have 
seen that if two classes represent coprime numbers, then the product class represents 
the product of the two numbers. The next proposition says that we can drop the 
coprimeness condition on the two numbers if we allow “representations” Q(x, y) = n
with nonprimitive pairs (x, y).


Proposition 7.5. If Q₁ and Q₂ are concordant forms with product Q₁Q₂ then each
product Q₁(x₁, y₁)Q₂(x₂, y₂) can be expressed as Q₁Q₂(X, Y) where X and Y are
certain explicit functions of (x₁, y₁) and (x₂, y₂) given in terms of the coefficients
of Q₁ and Q₂.


Proof: Let Q₁(x, y) = a₁x² + bxy + a₂cy² and Q₂(x, y) = a₂x² + bxy + a₁cy². It
will suffice to express a product Q₁(x₁, y₁)Q₂(x₂, y₂) as a₁a₂X² + bXY + cY² where
X and Y are given in terms of the coefficients a₁, a₂, b, c and the variables x₁, y₁, x₂, y₂.
First we compute Q₁(x₁, y₁)Q₂(x₂, y₂):

\[
\begin{aligned}
&(a_1x_1^2 + bx_1y_1 + a_2cy_1^2)(a_2x_2^2 + bx_2y_2 + a_1cy_2^2) \\
&\quad= \underbrace{a_1a_2x_1^2x_2^2}_{(1)} + \underbrace{a_1bx_1^2x_2y_2}_{(2)} + \underbrace{a_1^2cx_1^2y_2^2}_{(3)} + \underbrace{a_2bx_1x_2^2y_1}_{(4)} + \underbrace{b^2x_1x_2y_1y_2}_{(5)} \\
&\qquad + \underbrace{a_1bcx_1y_1y_2^2}_{(6)} + \underbrace{a_2^2cx_2^2y_1^2}_{(7)} + \underbrace{a_2bcx_2y_1^2y_2}_{(8)} + \underbrace{a_1a_2c^2y_1^2y_2^2}_{(9)}
\end{aligned}
\]

There are nine terms here and we label them (1)-(9) as shown. We want to choose X
and Y so that the sum of these nine terms is equal to a₁a₂X² + bXY + cY². Only
the terms (1) and (9) contain the factor a₁a₂ appearing in a₁a₂X² so to get (1) it is
reasonable to start with X = x₁x₂. Then to get (9) we expand this to:

\[
X = x_1x_2 \pm cy_1y_2
\]

Here we allow a sign ± for flexibility later in the calculation. Now we have:

\[
a_1a_2X^2 = \underbrace{a_1a_2x_1^2x_2^2}_{(1)} \pm 2a_1a_2cx_1x_2y_1y_2 + \underbrace{a_1a_2c^2y_1^2y_2^2}_{(9)}
\]

This gives (1) and (9) but the middle term does not appear among (1)-(9) so we will
have to have something that cancels it later.

Next, to get the term (2) we start with Y = a₁x₁y₂ so that bXY starts with
a₁bx₁²x₂y₂ which is (2). For symmetry let us expand Y = a₁x₁y₂ to:

\[
Y = a_1x_1y_2 + a_2x_2y_1
\]

This gives:

\[
\begin{aligned}
bXY &= \underbrace{a_1bx_1^2x_2y_2}_{(2)} + \underbrace{a_2bx_1x_2^2y_1}_{(4)} \pm \underbrace{a_1bcx_1y_1y_2^2}_{(6)} \pm \underbrace{a_2bcx_2y_1^2y_2}_{(8)} \\
cY^2 &= \underbrace{a_1^2cx_1^2y_2^2}_{(3)} + 2a_1a_2cx_1x_2y_1y_2 + \underbrace{a_2^2cx_2^2y_1^2}_{(7)}
\end{aligned}
\]

If we choose the sign ± in X to be minus then the middle term of cY² cancels the
middle term of a₁a₂X² but this gives the terms (6) and (8) in bXY the wrong sign
so we will need other terms to compensate for this. We have also not yet accounted
for the term (5). To get this let us add another term to Y so that X and Y are now:

\[
\begin{aligned}
X &= x_1x_2 - cy_1y_2 \\
Y &= a_1x_1y_2 + a_2x_2y_1 + by_1y_2
\end{aligned}
\]

Then we have:

\[
\begin{aligned}
a_1a_2X^2 &= \underbrace{a_1a_2x_1^2x_2^2}_{(1)} - 2a_1a_2cx_1x_2y_1y_2 + \underbrace{a_1a_2c^2y_1^2y_2^2}_{(9)} \\
bXY &= \underbrace{a_1bx_1^2x_2y_2}_{(2)} + \underbrace{a_2bx_1x_2^2y_1}_{(4)} - \underbrace{a_1bcx_1y_1y_2^2}_{(6)} - \underbrace{a_2bcx_2y_1^2y_2}_{(8)} \\
&\qquad + \underbrace{b^2x_1x_2y_1y_2}_{(5)} - b^2cy_1^2y_2^2 \\
cY^2 &= \underbrace{a_1^2cx_1^2y_2^2}_{(3)} + 2a_1a_2cx_1x_2y_1y_2 + \underbrace{a_2^2cx_2^2y_1^2}_{(7)} \\
&\qquad + b^2cy_1^2y_2^2 + \underbrace{2a_1bcx_1y_1y_2^2}_{(6)} + \underbrace{2a_2bcx_2y_1^2y_2}_{(8)}
\end{aligned}
\]

Now when we add everything up, the unlabeled terms cancel and the remaining terms
combine to give precisely the terms (1)-(9). □
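
Since the identity in this proof is a polynomial identity, it can be spot-checked on random integer inputs. The Python sketch below does this with the final formulas X = x₁x₂ − cy₁y₂ and Y = a₁x₁y₂ + a₂x₂y₁ + by₁y₂. The function names, the sampling range, and the seed are our own arbitrary choices, and a passing check is of course evidence rather than a proof.

```python
import random


def composition_XY(a1, a2, b, c, x1, y1, x2, y2):
    """The pair (X, Y) from the proof of Proposition 7.5."""
    X = x1 * x2 - c * y1 * y2
    Y = a1 * x1 * y2 + a2 * x2 * y1 + b * y1 * y2
    return X, Y


def check_identity(a1, a2, b, c, trials=1000, seed=0):
    """Test Q1(x1,y1) * Q2(x2,y2) == a1*a2*X^2 + b*X*Y + c*Y^2 on random inputs,
    where Q1 = [a1, b, a2*c] and Q2 = [a2, b, a1*c]."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, y1, x2, y2 = (rng.randint(-50, 50) for _ in range(4))
        q1 = a1 * x1 * x1 + b * x1 * y1 + a2 * c * y1 * y1
        q2 = a2 * x2 * x2 + b * x2 * y2 + a1 * c * y2 * y2
        X, Y = composition_XY(a1, a2, b, c, x1, y1, x2, y2)
        if q1 * q2 != a1 * a2 * X * X + b * X * Y + c * Y * Y:
            return False
    return True


print(check_identity(2, 3, 8, 7))     # [2,8,21] and [3,8,14], product [6,8,7]
print(check_identity(5, 6, 44, 17))   # [5,44,102] and [6,44,85], product [30,44,17]
```

Both calls use concordant pairs from the discriminant −104 examples earlier in the section.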


As a very simple illustration let us consider the case Δ = −24 where there are
the two reduced forms [1,0,6] and [2,0,3]. The form [1,0,6] is concordant to itself
and we have [1,0,6][1,0,6] = [1,0,6]. Also [1,0,6] is concordant to [2,0,3]
and we have [1,0,6][2,0,3] = [2,0,3]. However [2,0,3] is not concordant to itself,
although it is concordant to [3,0,2] which is equivalent to [2,0,3] and in fact
properly equivalent to it since both forms have mirror symmetry. Thus we have
[2,0,3][3,0,2] = [6,0,1] which is properly equivalent to [1,0,6]. If we apply the
preceding proposition with Q₁ = [2,0,3] and Q₂ = [3,0,2] then we have a₁ = 2,
a₂ = 3, b = 0, and c = 1, so the formulas for X and Y are X = x₁x₂ − y₁y₂ and
Y = 2x₁y₂ + 3x₂y₁. The calculations in the proof then give:

\[
(2x_1^2 + 3y_1^2)(3x_2^2 + 2y_2^2) = 6X^2 + Y^2 = 6(x_1x_2 - y_1y_2)^2 + (2x_1y_2 + 3x_2y_1)^2
\]

To express this in terms of the original two forms [1,0,6] and [2,0,3] we change
variables by switching x₂ and y₂ and then we interchange the two terms on the right
to get:

\[
(2x_1^2 + 3y_1^2)(2x_2^2 + 3y_2^2) = (2x_1x_2 + 3y_1y_2)^2 + 6(x_1y_2 - x_2y_1)^2
\]

This shows explicitly that the product of two numbers 2x² + 3y² is a number x² + 6y².
In a similar way we can obtain formulas for the other products:

\[
\begin{aligned}
(x_1^2 + 6y_1^2)(x_2^2 + 6y_2^2) &= (x_1x_2 - 6y_1y_2)^2 + 6(x_1y_2 + x_2y_1)^2 \\
(x_1^2 + 6y_1^2)(2x_2^2 + 3y_2^2) &= 2(x_1x_2 - 3y_1y_2)^2 + 3(x_1y_2 + 2x_2y_1)^2
\end{aligned}
\]

Other discriminants can be handled in the same way although the calculations can 
become complicated. One would start with a list of forms, one for each proper 
equivalence class of forms of the given discriminant. For each pair of forms on the 
list one would find a properly equivalent pair of concordant forms [a₁, b, a₂c] and
[a₂, b, a₁c], with suitable changes of variables to convert the given pair of forms to
the concordant pair. Then one would apply the formulas for X and Y in the proof
of the preceding proposition, and finally one would do another change of variables to
convert a₁a₂X² + bXY + cY² to a form on the original list.


Exercises 


1. In discriminant Δ = −56 we have the forms Q₂ = [2,0,7] and Q₃ = [3,2,5].
Compute Q₂Q₃ and Q₃² by finding suitable pairs of concordant forms.


2. Find all the concordant pairs of forms [3, b, c₁] and [5, b, c₂] of discriminant −120.
7.2 The Class Group for Forms


In the previous section we defined a method for multiplying any two elements of
the set CG(Δ) of proper equivalence classes of primitive forms of discriminant Δ,
which was to choose a pair of concordant forms Q₁ = a₁x² + bxy + a₂cy² and
Q₂ = a₂x² + bxy + a₁cy² in the two proper equivalence classes, and then the product of
these two classes is the class containing the form Q₁Q₂ = a₁a₂x² + bxy + cy². Note
that the form Q₁Q₂ is the same as Q₂Q₁ since a₁a₂ = a₂a₁ so this multiplication
operation in CG(Δ) is commutative.

The multiplication operation in CG(Δ) has a few other simple properties. A form
[a, b, c] is concordant to [1, b, ac] and [a, b, c][1, b, ac] = [a, b, c]. Since [1, b, ac]
represents 1 it is equivalent to the principal form, hence properly equivalent to it
since the principal form has mirror symmetry. Thus the class of the principal form
in CG(Δ) is an identity element for the multiplication.

Each form [a, b, c] is concordant to its mirror image form [c, b, a], and their
product is [ac, b, 1] which represents 1 hence is properly equivalent to the principal
form. Thus all elements of CG(Δ) have inverses for the multiplication operation,
obtained by taking mirror image forms.

Forms whose topographs have mirror symmetry give elements of CG(Δ) that are
equal to their inverses. The converse is also true since if a topograph is properly
equivalent to its mirror image, this says it has an orientation-reversing symmetry and
all such symmetries are mirror reflections by Proposition 5.8.

Another basic property of the multiplication operation in CG(Δ) is that it is
associative, although proving this takes a little more work. To do this we start with three
forms Q₁, Q₂, Q₃ giving three classes in CG(Δ). Choose a number a₁ in the
topograph of Q₁, then a number a₂ in the topograph of Q₂ coprime to a₁, then a number
a₃ in the topograph of Q₃ coprime to a₁a₂. Each Qᵢ is then properly equivalent to
a form [aᵢ, bᵢ, cᵢ]. Since each aᵢ is coprime to the other two, the Chinese Remainder
Theorem guarantees that there is a number b congruent to bᵢ mod aᵢ for each i.
We would like these congruences to be mod 2aᵢ instead of just mod aᵢ. To arrange
this we go back and first choose a₁ coprime to 2, then a₂ coprime to 2a₁, then a₃
coprime to 2a₁a₂ so each aᵢ is odd. Next, when we apply the Chinese Remainder
Theorem we find b congruent to each bᵢ mod aᵢ and also congruent to Δ mod 2,
hence also congruent to each bᵢ mod 2. Then b will be congruent to each bᵢ mod 2aᵢ
since 2 and aᵢ are coprime.

Having chosen b in this way, each form [aᵢ, bᵢ, cᵢ] is properly equivalent to a
form [aᵢ, b, cᵢ′]. Equating discriminants of the first two of these new forms, we see
that a₁c₁′ = a₂c₂′ so a₂ divides a₁c₁′ and hence it divides c₁′ since a₁ and a₂ are
coprime. Similarly a₃ divides c₁′. Since a₂ and a₃ are coprime this means that a₂a₃
divides c₁′ and we can write c₁′ = a₂a₃c for some integer c. Equating discriminants
then gives c₂′ = a₁a₃c and c₃′ = a₁a₂c. Thus we have the three forms [a₁, b, a₂a₃c],
[a₂, b, a₁a₃c], and [a₃, b, a₁a₂c], and each pair of these forms is concordant. If
we multiply the first two forms we get [a₁a₂, b, a₃c], and then multiplying this by
the third form [a₃, b, a₁a₂c] gives [a₁a₂a₃, b, c]. We get the same result if we first
multiply the second and third forms and then multiply their product by the first form.
This proves associativity.
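
To see the associativity argument in action, here is a small Python sketch for discriminant −104 using the forms [3,2,9] and [5,4,6] together with a third form [7,6,5] of the same discriminant, chosen by us because the three leading coefficients are odd and pairwise coprime. In place of an explicit Chinese Remainder Theorem calculation the code simply searches for b, which is adequate for small examples; all function names are hypothetical.

```python
def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c


def product(f1, f2):
    """Product [a1*a2, b, c] of a concordant pair [a1, b, a2*c], [a2, b, a1*c]."""
    (a1, b, c1), (a2, b2, c2) = f1, f2
    assert b == b2 and c2 % a1 == 0 and c1 == a2 * (c2 // a1)
    return (a1 * a2, b, c2 // a1)


def common_middle_coefficient(forms):
    """Smallest b >= 0 with b congruent to b_i mod 2*a_i for every form.
    Assumes positive, odd, pairwise coprime leading coefficients."""
    n = 1
    for a, _, _ in forms:
        n *= a
    for b in range(2 * n):
        if all((b - bi) % (2 * ai) == 0 for ai, bi, _ in forms):
            return b
    raise ValueError("no common middle coefficient found")


forms = [(3, 2, 9), (5, 4, 6), (7, 6, 5)]
D = discriminant(forms[0])
b = common_middle_coefficient(forms)
# Replace each form by the properly equivalent form with middle coefficient b.
f1, f2, f3 = [(a, b, (b * b - D) // (4 * a)) for a, _, _ in forms]
left = product(product(f1, f2), f3)
right = product(f1, product(f2, f3))
print(b, f1, f2, f3)
print(left, right)
```

Here b = 104, the three forms become [3,104,910], [5,104,546], [7,104,390], and both ways of bracketing the triple product give [105,104,26].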
We have now shown the following basic fact: 


Proposition 7.6. CG(Δ) is a group, that is, the multiplication is associative, there
is an identity element whose product with any element is that element, and each 
element has an inverse, so that the product of an element and its inverse is the 
identity element. 


For general groups the multiplication operation is not required to be commutative,
and this complicates the definition slightly. The identity element is required to act as
an identity when it is multiplied on both the right and the left. Thus there must be an
element e such that both ge = g and eg = g for all elements g in the group. Similarly,
inverses are required to be inverses for both multiplication on the right and on the
left, so each element g must have an inverse element g⁻¹ satisfying both gg⁻¹ = e
and g⁻¹g = e. However, since the multiplication in CG(Δ) is commutative these
left-right subtleties do not arise. Noncommutative groups often arise quite naturally,
and we have in fact already made extensive use of one, the group of linear fractional
transformations LF(Z). This also differs from CG(Δ) by having an infinite number
of elements, while the number of elements of CG(Δ) is the class number h, which
is always finite.

We should observe that the identity element in a group is always unique since if
two elements g and h both act as the identity then gh = h since g is an identity,
but we also have gh = g since h is an identity, so g = h. Another general fact is that
each element g in a group has a unique inverse since if h and h′ are two possibly
different inverses for g, so both gh and gh′ are the identity, then we have gh = gh′
so after multiplying both sides of this equation on the left by any inverse g⁻¹ we get
h = h′.


We can now re-examine some of the examples in Section 6.1 to verify that the
conjectured group structures on CG(Δ) are in fact correct.

First consider the case Δ = 40. Here there were two equivalence classes of forms,
given by Q₁ = x² − 10y² and Q₂ = 2x² − 5y². Both topographs have mirror symmetry
so proper equivalence is the same as equivalence. Thus the group CG(Δ) has two
elements, and we will use the same symbols Q₁ and Q₂ for these elements of CG(Δ).
The identity element of CG(Δ) is Q₁ since this is the principal form. Since Q₂ = Q₂⁻¹
by the mirror symmetry of its topograph, we have Q₂Q₂ = Q₁, the identity element
of CG(Δ). This determines the group structure in CG(Δ) completely, and it agrees
with what we predicted from the topographs in Section 6.1.

Next consider the case Δ = −84 where there were four equivalence classes of
forms Q₁, Q₂, Q₃, and Q₄. All four topographs have mirror symmetry so CG(Δ)
has four elements. The principal form Q₁ gives the identity element, and QᵢQᵢ = Q₁
for each i by the mirror symmetry. It remains to determine the products Q₂Q₃,
Q₂Q₄, and Q₃Q₄. For Q₂Q₃, this cannot be Q₁ otherwise Q₂ would be Q₃⁻¹, which
is Q₃ by the mirror symmetry. Also Q₂Q₃ cannot be Q₂ otherwise Q₃ would be the
identity element Q₁. Similarly, Q₂Q₃ cannot be Q₃. Therefore we must have
Q₂Q₃ = Q₄. The same reasoning shows that Q₂Q₄ = Q₃ and Q₃Q₄ = Q₂.

In more complicated cases it can be helpful to use the fact that if two primitive
forms Q₁ and Q₂ of the discriminant Δ represent coprime numbers a₁ and a₂ then
their product Q₁Q₂ represents a₁a₂. This is a consequence of results in the previous
section, particularly Lemma 7.3. For example in the preceding case Δ = −84 we could
also show that Q₂Q₃ = Q₄ by looking at the topographs to see that Q₂ represents 3
and Q₃ represents 2 so Q₂Q₃ must represent 6. The only element of CG(Δ) whose
topograph contains 6 is Q₄, so Q₂Q₃ = Q₄. Similarly one sees that Q₂Q₄ = Q₃ using
the numbers 3 and 5, and Q₃Q₄ = Q₂ using 2 and 5. We could also deduce the last
two formulas from Q₂Q₃ = Q₄ by multiplying both sides by Q₂ or Q₃.

The next example from Section 6.1 is $\Delta = -56$ where there were three equivalence
classes of forms $Q_1$, $Q_2$, and $Q_3$. For $Q_1$ and $Q_2$ the topographs have mirror
symmetry but not for $Q_3$ so there is another form $Q_4$ whose topograph is the mirror image
of the one for $Q_3$, with $Q_4 = Q_3^{-1}$ in $CG(\Delta)$. Again we have $Q_1$ the identity in
$CG(\Delta)$ and we have $Q_2Q_2 = Q_1$ by mirror symmetry. However it is not so easy to determine
$Q_3Q_3$. The topograph of $Q_3$ contains 3 and 5 so the topograph of $Q_3Q_3$ must contain
15, but 15 is in the topographs of both $Q_1$ and $Q_2$ so this is inconclusive. The
same thing happens for other pairs of primes in the topograph of $Q_3$ such as 3, 13 or
5, 19. However, since the topograph of $Q_3$ does not have mirror symmetry, we know
that $Q_3$ is not $Q_3^{-1}$ hence $Q_3Q_3$ is not $Q_1$ so it must be $Q_2$. Thus all four elements
of $CG(\Delta)$ are powers of $Q_3$, namely $Q_3$, $Q_3^2 = Q_2$, $Q_3^4 = Q_2^2 = Q_1$, and $Q_3^3 = Q_4$
since $Q_3^4 = Q_1$ implies $Q_3^3 = Q_3^{-1}$ which is $Q_4$. This determines the structure of
$CG(\Delta)$ completely. For example $Q_2Q_4 = Q_3^2Q_3^3 = Q_3^5 = Q_3$ since $Q_3^4 = Q_1$.
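
The powers of $Q_3$ can also be computed directly. The sketch below is an addition for illustration and rests on several assumptions: forms are positive definite and written as triples `(a, b, c)`, the product is computed by the classical Dirichlet-style composition formula (which we assume agrees with the product defined via concordant forms), and the helper names `egcd`, `reduce_form`, `compose`, `powers` are hypothetical.

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and g == gcd(a, b) >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    if a < 0:
        a, x0, y0 = -a, -x0, -y0
    return a, x0, y0

def reduce_form(a, b, c):
    """Reduce a positive definite form [a, b, c] using proper equivalences only."""
    while True:
        k = (a - b) // (2 * a)                    # x -> x + k*y moves b into (-a, a]
        b, c = b + 2 * a * k, a * k * k + b * k + c
        if a > c or (a == c and b < 0):
            a, b, c = c, -b, a                    # (x, y) -> (-y, x)
        else:
            return (a, b, c)

def compose(f1, f2):
    """Reduced form in the class of the product of two primitive forms."""
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    D = b1 * b1 - 4 * a1 * c1
    assert D == b2 * b2 - 4 * a2 * c2, "forms must share a discriminant"
    s = (b1 + b2) // 2
    g12, x, y = egcd(a1, a2)                      # a1*x + a2*y == g12
    g, p, w = egcd(g12, s)                        # g12*p + s*w == g
    u, v = x * p, y * p                           # a1*u + a2*v + s*w == g
    a3 = a1 * a2 // (g * g)
    B = (u * a1 * b2 + v * a2 * b1 + w * (b1 * b2 + D) // 2) // g
    B %= 2 * a3
    return reduce_form(a3, B, (B * B - D) // (4 * a3))

def powers(f):
    """Successive powers f, f^2, ... up to the principal form (leading coefficient 1)."""
    f = reduce_form(*f)
    out = [f]
    while out[-1][0] != 1:
        out.append(compose(out[-1], f))
    return out
```

With $Q_3 = [3,2,5]$, `powers((3, 2, 5))` runs through $[3,2,5]$, $[2,0,7]$, $[3,-2,5]$, $[1,0,14]$, which agrees with $Q_3$, $Q_3^2 = Q_2$, $Q_3^3 = Q_4$, $Q_3^4 = Q_1$ when $Q_4$ is taken to be $[3,-2,5]$.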


In the preceding examples the group $CG(\Delta)$ was small enough that its structure
could be determined just from the topographs. This is not always the case in more
complicated examples, however. One difficulty is that a form $Q$ and its inverse $Q^{-1}$
have mirror image topographs containing exactly the same numbers, so from the
topographs one may be able to compute a product $Q_iQ_j = Q_k^{\pm 1}$ but one cannot always
tell which exponent $+1$ or $-1$ is correct. Another problem is that some numbers can
appear in more than one topograph.

We illustrate these difficulties with an example, discriminant $\Delta = -104$ where
we showed the topographs of the four equivalence classes of forms in the previous
section. Since the first two forms $Q_1$ and $Q_2$ have mirror symmetry while the second
two $Q_3$ and $Q_4$ do not, the group $CG(\Delta)$ has six elements, with the principal form
$Q_1$ the identity and $Q_2^2 = Q_1$. From the product $3 \cdot 17 = 51$ we see that $Q_3^2$ is $Q_1$,
$Q_3$, or $Q_3^{-1}$ but $Q_1$ is ruled out since the topograph of $Q_3$ does not have mirror
symmetry, and $Q_3$ is ruled out since $Q_3^2 = Q_3$ would imply $Q_3 = Q_1$. Thus $Q_3^2 = Q_3^{-1}$,
or equivalently, $Q_3^3 = Q_1$. Similarly, we can try to compute $Q_4^2$ from the product
$5 \cdot 7 = 35$ which appears in the topographs of $Q_1$ and $Q_3$. The possibility that $Q_4^2$ is
$Q_1$ is ruled out since $Q_4$ does not have mirror symmetry. Thus $Q_4^2 = Q_3^{\pm 1}$ but we
cannot tell which exponent is correct from the topographs and the argument we used
to compute $Q_3^2$ does not work here. In fact we computed $Q_4^2$ in the previous section
by finding a pair of concordant forms properly equivalent to $Q_4$, and it turned out
that $Q_4^2$ was $Q_3^{-1}$, the mirror image of $Q_3$.

Let us see what the higher powers of $Q_4$ are. Note first that $Q_4^6 = (Q_4^2)^3 =
(Q_3^{-1})^3 = Q_1$ since $(Q_3^{-1})^3$ is the inverse of $Q_3^3 = Q_1$. From $Q_4^6 = Q_1$ we obtain
$Q_4^5 = Q_4^{-1}$ and $Q_4^4 = Q_4^{-2} = Q_3$. For $Q_4^3$ we have $(Q_4^3)^2 = Q_4^6 = Q_1$ so $Q_4^3$ has mirror
symmetry making it either $Q_1$ or $Q_2$, but $Q_4^3 = Q_1$ is impossible since it would say
that $Q_4^2$ is $Q_4^{-1}$ rather than $Q_3^{-1}$. Thus $Q_4^3 = Q_2$ and so the six elements of $CG(\Delta)$
are the powers $Q_4^i$ for $i = 1,2,3,4,5,6$ with $Q_4^6$ the identity. This determines the
multiplication in $CG(\Delta)$ completely. We will see in Section 7.3 that a group with
six elements and commutative multiplication always contains an element whose first
through sixth powers are all the elements of the group.


Now we come to our main application of the class group, which is to the problem
of determining which primitive forms of a given discriminant $\Delta$ represent a given
number $n$. It will suffice to consider only the case that $n$ is positive. This is no
restriction when $\Delta < 0$ since there is no need to consider elliptic forms with negative
values. When $\Delta > 0$, if we know which forms represent positive $n$ then the negatives
of these forms will be the forms representing $-n$. The only forms representing 1 are
the forms equivalent to the principal form so we can assume $n > 1$.

Here is the main result, where for convenience we continue to use the same symbol
for a primitive form and for the element of $CG(\Delta)$ that it determines:


Theorem 7.7. (a) Let a number $n > 1$ be factored as $n = p_1^{e_1} \cdots p_k^{e_k}$ for distinct
primes $p_i$ with $e_i > 0$ for each $i$. Then the primitive forms of discriminant $\Delta$ that
represent $n$ are the products $Q_1 \cdots Q_k$ where $Q_i$ is a primitive form representing
$p_i^{e_i}$.

(b) The forms of discriminant $\Delta$ representing a power $p^e$ of a prime $p$ not dividing
$\Delta$ are primitive and are exactly the forms $Q^{\pm e}$ where $Q$ is a form representing $p$.
If $p$ divides $\Delta$ but not the conductor then the only power of $p$ represented in
discriminant $\Delta$ is $p$ itself, and it is represented by a primitive form.

The theorem says nothing about the primitive forms that represent powers of a
prime dividing the conductor, and indeed this is a delicate question as the examples
in the large table in Section 6.2 show. In the first statement in (b) the form $Q$ is unique
up to equivalence by Proposition 6.15. It may or may not have mirror symmetry, so
$Q$ and $Q^{-1}$ may be different elements of $CG(\Delta)$ and the same is true of $Q^e$ and $Q^{-e}$.
In the second statement of (b) a form $Q$ representing $p$ is unique up to equivalence
and is symmetric by Proposition 6.17 so there is no need to consider $Q^{-1}$.

If $\Delta$ is a fundamental discriminant then the conductor is 1 so the theorem gives a
full reduction of the representation problem for nonprimes to the corresponding
problem for primes: The forms representing $p_1^{e_1} \cdots p_k^{e_k}$ are the products
$Q_1^{\pm e_1} \cdots Q_k^{\pm e_k}$ where $Q_i$ represents $p_i$ and $e_i = 1$ if $p_i$ divides $\Delta$. For
nonfundamental discriminants one obtains all primitive forms representing
$p_1^{e_1} \cdots p_k^{e_k}$ by modifying the previous statement to allow some of the primes $p_i$ to
divide the conductor, replacing the corresponding terms $Q_i^{\pm e_i}$ by any primitive
forms $Q_i$ that represent $p_i^{e_i}$.

As a special case, the only forms representing a power $p^e$ of a prime $p$ not dividing
the discriminant are $Q^e$ and $Q^{-e}$ where $Q$ represents $p$. Since $Q^{-e}$ is the inverse
of $Q^e$ in $CG(\Delta)$, these two forms are equivalent so there is only one equivalence class
of forms representing $p^e$. When $p$ is odd this was proved in Proposition 6.15, and
now we see that it holds also for $p = 2$.

When there are two or more distinct prime factors $p_i$ the choices between $Q_i^{e_i}$
and $Q_i^{-e_i}$ can lead to nonequivalent forms representing the same number. For
example for a product $p_1p_2$ of two different primes there can be four different proper
equivalence classes $Q_1^{\pm 1}Q_2^{\pm 1}$ for the four choices of signs, and these can give two
different equivalence classes, even if $Q_1 = Q_2$.


Proof of Theorem 7.7: If $n$ is represented by a form $Q$ then $Q$ is properly equivalent
to a form $[n, b, c]$. If $n$ factors as $n = a_1a_2 \cdots a_k$ then $[n, b, c]$ factors as
$[n, b, c] = [a_1, b, nc/a_1][n/a_1, b, a_1c]$ with the latter two forms being concordant. If
$k = 2$ this gives $[a_1a_2, b, c] = [a_1, b, a_2c][a_2, b, a_1c]$. If $k > 2$ we can factor
$[n/a_1, b, a_1c]$ further as $[a_2, b, nc/a_2][n/a_1a_2, b, a_1a_2c]$. Continuing in this way, we
eventually get:

$$[n, b, c] = [a_1, b, nc/a_1][a_2, b, nc/a_2] \cdots [a_k, b, nc/a_k]$$

Here any two forms in the product on the right are concordant.
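
As a small illustration of the first step (an example added here, not taken from the text), take $\Delta = -56$ and the form $[15, 2, 1]$, with $n = 15$ factored as $a_1a_2 = 3 \cdot 5$ and $c = 1$:

```latex
\[
[15,2,1] \;=\; [a_1, b, nc/a_1]\,[n/a_1, b, a_1c] \;=\; [3,2,5]\,[5,2,3],
\qquad 2^2 - 4\cdot 15\cdot 1 \;=\; 2^2 - 4\cdot 3\cdot 5 \;=\; -56 .
\]
```

All three forms share the middle coefficient $b = 2$ and the discriminant $-56$.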

In particular for the prime factorization $n = p_1^{e_1} \cdots p_k^{e_k}$ we have $[n, b, c] =
Q_1 \cdots Q_k$ for $Q_i = [p_i^{e_i}, b, nc/p_i^{e_i}]$, a form representing $p_i^{e_i}$. By Lemma 7.1 the
form $[n, b, c]$ is primitive if and only if each $Q_i$ is primitive since the primes $p_i$ are
assumed to be distinct. This proves half of statement (a), that each primitive form
representing $n$ can be expressed as a product $Q_1 \cdots Q_k$ with $Q_i$ a primitive form
representing $p_i^{e_i}$. The other half is the statement that a product $Q_1 \cdots Q_k$ is
primitive and represents $n$ if each $Q_i$ is primitive and represents $p_i^{e_i}$. This follows by
applying Lemma 7.3 repeatedly, first to forms $[p_1^{e_1}, b_1, c_1]$ and $[p_2^{e_2}, b_2, c_2]$ properly
equivalent to $Q_1$ and $Q_2$, then to the product of the two resulting forms and a form
$[p_3^{e_3}, b_3, c_3]$ properly equivalent to $Q_3$, and so on.

For part (b) of the theorem, a form representing $p^e$ is properly equivalent to a
form $[p^e, b, c]$. As above, this factors as $[p^e, b, c] = [p, b, p^{e-1}c]^e$. If $p$ does not
divide the conductor then the forms $Q = [p, b, p^{e-1}c]$ representing $p$ and $Q^e =
[p^e, b, c]$ representing $p^e$ are primitive by Proposition 6.14. Since forms representing
primes are unique up to equivalence, any form representing $p$ must be properly
equivalent to $Q$ or $Q^{-1}$. Hence the form we started with that represents $p^e$ is properly
equivalent to the $e$-th power of $Q$ or $Q^{-1}$, that is, to $Q^e$ or $Q^{-e}$.

If $p$ divides $\Delta$ but not the conductor then Proposition 6.7 says that $p$ is
represented by a form of discriminant $\Delta$ but no higher power of $p$ is represented. The
form representing $p$ is primitive by Proposition 6.14. □


Let us look at a few examples. For $\Delta = -56$, a fundamental discriminant, we
have already determined the group structure of $CG(\Delta)$ which has four elements, but
we can use the preceding Theorem 7.7 to quickly rederive the group structure from
the topographs which were shown in Section 6.1. For this it suffices to look just at
how the powers of 3 are represented. Since 3 is represented by $Q_3 = [3,2,5]$ it
follows that $3^i$ is represented by $Q_3^{\pm i}$. The topographs show that $3^2$ is represented
by $Q_2 = [2,0,7]$ so $Q_3^2 = Q_2^{\pm 1}$, but $Q_2 = Q_2^{-1}$ since the topograph of $Q_2$ has mirror
symmetry, so we have $Q_3^2 = Q_2$. Next, $3^3$ is represented by $Q_3$ so $Q_3^3 = Q_3^{\pm 1}$, but
$Q_3^3 = Q_3$ would imply $Q_3^2 = Q_1$ contradicting the fact that $Q_3^2 = Q_2$ so $Q_3^3 = Q_3^{-1}$.
And finally $3^4$ is represented by $Q_1 = [1,0,14]$ so $Q_3^4 = Q_1^{\pm 1} = Q_1$. Thus we see
again that $CG(\Delta)$ consists of the powers of $Q_3$, with $Q_3^4$ the identity.

From this we can determine which forms represent a number $n = p_1^{e_1} \cdots p_k^{e_k}$
with $e_i \le 1$ for $p_i = 2, 7$. Changing notation for convenience, let $Q$ be the form
$[3,2,5]$ previously called $Q_3$, so the other three forms are powers of $Q$. According to
the theorem, the forms representing $n$ are the products $(Q^{q_1})^{\pm e_1} \cdots (Q^{q_k})^{\pm e_k}$ where
$Q^{q_i}$ is the power of $Q$ representing $p_i$. We may assume each $q_i$ is 0, 1, or 2 since
$Q^3 = Q^{-1}$ represents the same numbers as $Q$. The product $(Q^{q_1})^{\pm e_1} \cdots (Q^{q_k})^{\pm e_k}$ is
then a power $Q^e$ where only the value of $e$ mod 4 matters. Primes $p_i$ represented
by $Q^0$, the identity in $CG(\Delta)$, can be ignored. Then we have

$$e = \sum_{Q} \pm e_i + \sum_{Q^2} \pm 2e_i$$

where the first sum is over subscripts $i$ such that $p_i$ is represented by $Q$ and similarly
for the second sum with $Q^2$ in place of $Q$. The sign $\pm$ in the second sum can be
ignored since $Q^2 = Q^{-2}$. As we saw in Section 6.3, the forms $Q^0$ and $Q^2$ make up
one genus while $Q$ and the equivalent form $Q^3 = Q^{-1}$ make up the other genus. The
parity of $e$ thus determines the genus of the forms representing $n$. (Recall that forms
representing a given number all belong to the same genus.) From the formula for $e$
we can deduce that $n$ is represented by both $Q^0$ and $Q^2$ exactly when $e$ is even and
at least one $e_i$ in the first sum is odd since this is the only time when the choice of
the signs $\pm$ matters.

As another example, when $\Delta = -104$ we computed $CG(\Delta)$ to have six elements,
the first through sixth powers of the form $Q_4 = [5,4,6]$ with $Q_4^6 = Q_1$, the identity
in $CG(\Delta)$. We can obtain most of this structure a little more efficiently now using
Theorem 7.7. Looking at the topographs, we see that 5, $5^2$, and $5^3$ are represented
by $Q_4$, $Q_3$, and $Q_2$ so $Q_4^2 = Q_3^{\pm 1}$ and $Q_4^3 = Q_2^{\pm 1}$ which is $Q_2$ since the topograph of
$Q_2$ has mirror symmetry. Since $Q_2^2 = Q_1$ it follows that $Q_4^6 = Q_2^2 = Q_1$ so $Q_4^5 = Q_4^{-1}$
and $Q_4^4 = Q_4^{-2} = Q_3^{\mp 1}$. We cannot determine which sign in $Q_4^2 = Q_3^{\pm 1}$ is correct just
from the topographs, but we showed that $Q_4^2 = Q_3^{-1}$ earlier.

The forms representing a number $n = p_1^{e_1} \cdots p_k^{e_k}$ when $\Delta = -104$ can be
described in a similar way to the preceding example with $\Delta = -56$. For $\Delta = -104$ the
exceptional primes $p_i$ with $e_i \le 1$ are 2 and 13. The forms representing $n$ are the
products $(Q^{q_1})^{\pm e_1} \cdots (Q^{q_k})^{\pm e_k}$ where $Q^{q_i}$ is the power of $Q = Q_4$ representing $p_i$
with $q_i$ either 0, 1, 2, or 3. Writing this product as $Q^e$ where only the value of $e$
mod 6 matters, the formula for $e$ now has another term:

$$e = \sum_{Q} \pm e_i + \sum_{Q^2} \pm 2e_i + \sum_{Q^3} \pm 3e_i$$

The parity of $e$ again determines the genus, with one genus consisting of $Q^0$ and
$Q^2$ (which is equivalent to $Q^4$) and the other genus consisting of $Q$ and $Q^3$ (with
$Q^5 = Q^{-1}$ equivalent to $Q$). From the formula for $e$ one could work out when a
number is represented by both forms within a genus and when it is represented by
only one form. Note that for the formula above it does not matter whether $Q_4^2$ is $Q_3$
or $Q_3^{-1}$ since both these forms represent the same numbers.
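
The two formulas for $e$ are easy to evaluate by enumerating the sign choices. The following Python sketch is added for illustration; the function name `exponents` and the encoding of each prime factor as a pair `(q_i, e_i)` are hypothetical conventions, with `modulus` equal to 4 for $\Delta = -56$ and 6 for $\Delta = -104$.

```python
from itertools import product

def exponents(terms, modulus):
    """All values of e = sum(+-q_i * e_i) mod modulus over every choice of signs.

    terms is a list of pairs (q_i, e_i): the prime p_i is represented by the
    power Q**q_i and occurs in n with exponent e_i.
    """
    values = set()
    for signs in product((1, -1), repeat=len(terms)):
        e = sum(s * q * k for s, (q, k) in zip(signs, terms))
        values.add(e % modulus)
    return values
```

For instance, with $\Delta = -56$ and $n = 15 = 3 \cdot 5$, where both primes are represented by $Q$, `exponents([(1, 1), (1, 1)], 4)` gives the exponents 0 and 2, in line with 15 appearing in the topographs of both $Q_1$ and $Q_2$. With $\Delta = -104$ and $n = 35 = 5 \cdot 7$ the same input with modulus 6 gives 0, 2 and 4, that is, $Q_1$ and $Q_3^{\pm 1}$.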


Exercises 


1. For discriminant $\Delta = -47$ show the class number is 5 and determine the
multiplication rules for the five proper equivalence classes of forms.


2. Determine the numbers represented by each of the two forms [1,1,6] and [2,1,3]. 


3. Show that the numbers represented by $x^2 + 4y^2$ are the numbers $2^m p_1 \cdots p_k$
where $m$ is 0, 2, or 3 and each $p_i$ is a prime congruent to 1 mod 4.


4. Show that if two forms $Q_1$ and $Q_2$ in the class group $CG(\Delta)$ represent coprime
numbers $n_1$ and $n_2$ then their product $Q_1Q_2$ represents $n_1n_2$. Give an example
where this fails without the coprimeness assumption, even if $n_1$ and $n_2$ are coprime
to $\Delta$.


5. For a fixed discriminant $\Delta$ consider the set $S_\Delta$ of primes that do not divide the
conductor and are represented by primitive forms with mirror symmetry. Show that
numbers that are products of primes in $S_\Delta$ are represented by at most one form of
discriminant $\Delta$, up to equivalence, and this form has mirror symmetry.


7.3 Finite Abelian Groups 


A group whose multiplication operation is commutative is usually referred to as 
an abelian group, after the mathematician Niels Henrik Abel (1802-1829), although 
the term “commutative group” is sometimes used as well. The aim of this section is 
to explain the structure of abelian groups with finitely many elements. This structure 
is far simpler than for finite nonabelian groups which can be extremely complicated, 
with no hope of being completely classified. 

The number of elements in a group $G$ is called the order of $G$. This can be finite
or infinite, but for the class group $CG(\Delta)$ it is always finite since it is just the class
number for discriminant $\Delta$.

For an element $g$ in a group $G$ the smallest positive integer $n$ such that $g^n$ is
the identity is called the order of $g$ if such an $n$ exists, and otherwise the order of
$g$ is said to be infinite. Each element $g$ in a finite group $G$ has finite order since the
powers $g, g^2, g^3, \cdots$ cannot all be distinct elements of $G$, so we must have $g^m = g^n$
for some $m \ne n$, say $m < n$, and then if we multiply both $g^m$ and $g^n$ by $g^{-m}$, the
inverse of $g^m$, we see that $g^{n-m}$ is the identity. Thus some positive power of $g$ is
the identity, and the smallest such power is the order of $g$. The identity element of a
group always has order 1 and is obviously the only element of order 1.

If an element $g$ of a group $G$ has order $n$ then all the powers $g, g^2, g^3, \cdots, g^n$
must be distinct elements of $G$, otherwise if two of these powers $g^i$ and $g^j$ were
equal with $i < j$ we would have $g^{j-i}$ equal to the identity, with $j - i < n$, contrary
to the assumption that $g$ has order $n$. If $g$ has order $n$ then the higher powers
$g^{n+1}, g^{n+2}, \cdots$ just cycle through the powers $g, g^2, \cdots, g^n$ repeatedly. In particular
the only powers of $g$ that are the identity element of $G$ are the powers $g^{kn}$ for
integers $k$. The negative powers of $g$ are just the inverses of the positive powers, and
these cycle through the same sequence $g, g^2, \cdots, g^n$ in reverse order since $g^{-1} =
g^{n-1}$, $g^{-2} = g^{n-2}$, and so on.

If $g$ has order $n$ then the order of each power $g^k$ can be determined in the
following way. The order of $g^k$ is the number $m$ such that $mk$ is the smallest multiple
of $k$ that is also a multiple of $n$. The smallest common multiple of $k$ and $n$ is $kn/d$
where $d$ is the greatest common divisor of $k$ and $n$, as one can see by comparing the
prime factorizations of $k$ and $n$. Thus $mk = kn/d$ so $m = n/d$ and the order of $g^k$
is $n/d$.

In particular if $g$ has order $n = kl$, then $g^k$ has order $l$. This means that for
each divisor $l$ of $n$ there is a power of $g$ having order $l$.

For example if $g$ has order 6 then $g^2$ and $g^4$ have order 3, $g^3$ has order 2, and
$g^5$ has order 6. Similarly, if $g$ has order 12 then $g^2$ and $g^{10}$ have order 6, $g^3$ and
$g^9$ have order 4, $g^4$ and $g^8$ have order 3, and $g^5$, $g^7$, and $g^{11}$ have order 12.
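
These orders can be confirmed by a direct search. In the Python sketch below, which is an illustration added here with hypothetical function names, the power $g^k$ of an element of order $n$ is modelled by the residue $k$ mod $n$ under addition, so that $(g^k)^m$ is the identity exactly when $mk$ is a multiple of $n$.

```python
from math import gcd

def order_of_power(n, k):
    """Order of g**k when g has order n, found by trying m = 1, 2, 3, ..."""
    m = 1
    while (m * k) % n != 0:
        m += 1
    return m

def predicted_order(n, k):
    """The value n/d with d = gcd(k, n) derived in the text."""
    return n // gcd(k, n)
```

For $n = 12$ the search reproduces the orders listed above, and the two functions agree for every $k$ one cares to try.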

A finite group $G$ is called cyclic if there is an element $g \in G$ such that every
element of $G$ is a power of $g$, so the elements $g, g^2, g^3, \cdots$ cycle through all the
elements of $G$. The element $g$ is then called a generator of $G$. Cyclic groups are
automatically abelian since $g^kg^l$ and $g^lg^k$ both equal $g^{k+l}$. If a generator $g$ of a
cyclic group $G$ has order $n$, then this is also the order of $G$ since all the powers
$g, g^2, g^3, \cdots, g^n$ must be distinct, as noted earlier. Thus a group of order $n$ is cyclic
exactly when it contains an element of order $n$. In a cyclic group there are generally
a number of different choices for a generator since if $g$ is one generator of order $n$
then $g^k$ is a generator exactly when it has order $n$, which is equivalent to $k$ being
coprime to $n$. The number of different generators is thus $\varphi(n)$ where $\varphi$ is the Euler
phi function.

Among the groups $CG(\Delta)$ that we computed in the previous section, $CG(\Delta)$ is
cyclic of order 4 for $\Delta = -56$ and cyclic of order 6 for $\Delta = -104$, but for $\Delta = -84$
the group is not cyclic since it has order 4 but each element other than the identity
has order 2.

Cyclic groups are easy to understand, and our next goal is to see that all finite
abelian groups are built from cyclic groups by a fairly simple procedure. Given two
groups $G_1$ and $G_2$, the product group $G_1 \times G_2$ is defined to be the set of all pairs
$(g_1, g_2)$ with $g_1 \in G_1$ and $g_2 \in G_2$. The multiplication operation in $G_1 \times G_2$ is defined
by $(g_1, g_2)(g_1', g_2') = (g_1g_1', g_2g_2')$, so the coordinates are multiplied separately. The
identity element of $G_1 \times G_2$ is the pair $(g_1, g_2)$ with $g_1$ the identity in $G_1$ and $g_2$ the
identity in $G_2$. The inverse of an element $(g_1, g_2)$ is $(g_1^{-1}, g_2^{-1})$. More generally one
can define products $G_1 \times \cdots \times G_k$ of any collection of groups $G_1, \cdots, G_k$, with the
elements of this product group being $k$-tuples $(g_1, \cdots, g_k)$ with $g_i \in G_i$ for each $i$.
One can also iterate the process of forming products of groups but this gives nothing
new since for example $(G_1 \times G_2) \times G_3$ is really the same as $G_1 \times G_2 \times G_3$ by rewriting
its elements $((g_1, g_2), g_3)$ as $(g_1, g_2, g_3)$.

If $G_1$ and $G_2$ are finite groups of orders $n_1$ and $n_2$, then $G_1 \times G_2$ has order $n_1n_2$
since the two coordinates $g_1$ and $g_2$ of pairs $(g_1, g_2)$ in $G_1 \times G_2$ vary independently
over $G_1$ and $G_2$. For an element $(g_1, g_2)$ in $G_1 \times G_2$, if $g_1$ has order $n_1$ and $g_2$ has
order $n_2$ then the order of $(g_1, g_2)$ is the least common multiple of $n_1$ and $n_2$ since
a power $(g_1, g_2)^n = (g_1^n, g_2^n)$ is the identity exactly when $n$ is a multiple of both $n_1$
and $n_2$, so the order of $(g_1, g_2)$ is the smallest such multiple. In particular, if $n_1$ and
$n_2$ are coprime then $(g_1, g_2)$ has order $n_1n_2$. This leads to the following interesting
fact:


Proposition 7.8. If $G_1$ and $G_2$ are cyclic of coprime orders $n_1$ and $n_2$ then $G_1 \times G_2$
is cyclic of order $n_1n_2$.

Proof: If $g_1$ is a generator of $G_1$ of order $n_1$ and $g_2$ is a generator of $G_2$ of order $n_2$
then $(g_1, g_2)$ has order $n_1n_2$ if $n_1$ and $n_2$ are coprime, as we saw above. The group
$G_1 \times G_2$ is therefore cyclic since it contains an element whose order equals the order
of the group. □
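
The proposition, and the need for the coprimeness hypothesis, can be checked on small cases. In the Python sketch below, added as an illustration, a cyclic group of order $n$ is modelled by the integers mod $n$ under addition; this model and the function names are assumptions made for the example.

```python
from itertools import product

def element_order(elem, moduli):
    """Order of elem in Z_{n1} x ... x Z_{nk}, with the groups written additively."""
    m = 1
    while any((m * x) % n for x, n in zip(elem, moduli)):
        m += 1
    return m

def is_cyclic(moduli):
    """True if some element has order equal to the order of the whole product."""
    size = 1
    for n in moduli:
        size *= n
    elements = product(*(range(n) for n in moduli))
    return any(element_order(e, moduli) == size for e in elements)
```

For example `is_cyclic((3, 4))` holds while `is_cyclic((2, 2))` and `is_cyclic((4, 6))` fail, and `element_order((1, 1), (4, 6))` is 12, the least common multiple of 4 and 6.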


Now we come to the main result in this section, the basic structure theorem for 
finite abelian groups: 


Theorem 7.9. Every finite abelian group is a product $G_1 \times \cdots \times G_k$ of cyclic groups
$G_1, \cdots, G_k$, with the possibility $k = 1$ allowed when the group itself is cyclic.


For the proof we will use the notation $o(g)$ for the order of an element $g \in G$. The
identity element of $G$ will be written simply as 1. We need two preliminary lemmas.


Lemma 7.10. If two elements $g_1$ and $g_2$ of a finite abelian group have coprime
orders $o(g_1)$ and $o(g_2)$ then their product $g_1g_2$ has order $o(g_1)o(g_2)$.


This need not be true if $o(g_1)$ and $o(g_2)$ are not coprime. As an extreme example
take $g_2$ to be $g_1^{-1}$. Another example would be to take $g_1$ to be an element of maximal
order in $G$ and $g_2$ any element with $o(g_2) > 1$.


Proof: Let $n_1 = o(g_1)$ and $n_2 = o(g_2)$. Then $(g_1g_2)^{n_1n_2} = g_1^{n_1n_2}g_2^{n_1n_2} = 1$ so it will
suffice to show that if $(g_1g_2)^n = 1$ then $n$ is a multiple of $n_1n_2$.
Suppose $(g_1g_2)^n = 1$ and let $g = g_1^n = g_2^{-n}$. Then $g^{n_1} = g_1^{nn_1} = (g_1^{n_1})^n = 1$ so
$o(g)$ divides $n_1$. Similarly, $g^{n_2} = g_2^{-nn_2} = (g_2^{n_2})^{-n} = 1$ so $o(g)$ divides $n_2$. Since
$n_1$ and $n_2$ are assumed to be coprime, this means $o(g) = 1$ and hence $g = 1$. Thus
$g_1^n = 1$ and $g_2^{-n} = 1$, which implies $g_2^n = 1$. Since $g_1^n = 1$ it follows that $n$ is a
multiple of $n_1$, and $n$ is also a multiple of $n_2$ since $g_2^n = 1$. As $n_1$ and $n_2$ are
coprime, this implies that $n$ is a multiple of $n_1n_2$. □


Lemma 7.11. For a finite abelian group G let m be the maximal order of elements 
of G. Then the order of each element of G is a divisor of m. 


Proof: Suppose this is false, so there is an element $g$ such that $o(g)$ does not divide
the maximal order $m$. This means there is some prime power $p^k$ dividing $o(g)$ such
that the highest power $p^l$ dividing $m$ has $l < k$. Since $p^k$ divides $o(g)$ there is a
power of $g$ having order $p^k$. Let $g_1$ be this power of $g$ and let $g_2$ be an element
of $G$ of order $m/p^l$, for example $h^{p^l}$ where $h$ is an element of order $m$. Then by
the preceding lemma the product $g_1g_2$ has order $p^k(m/p^l)$ which is greater than $m$
since $k > l$. This contradicts the maximality of $m$, so we conclude that $o(g)$ divides
$m$ for all $g \in G$. □


Proof of Theorem 7.9: Let $g_1$ be an element of $G$ of maximal order $n_1$. If every
element of $G$ is a power of $g_1$ then $G$ is cyclic and there is nothing more to prove. If
there are elements of $G$ that are not powers of $g_1$ then we proceed by induction to
find further elements $g_2, \cdots, g_q$ satisfying the following two properties:

$(1_q)$ The elements $g_1, g_2, \cdots, g_q$ have orders $n_1, n_2, \cdots, n_q$ where $n_i > 1$ for each $i$
and $n_i$ is divisible by $n_{i+1}$ for each $i < q$.

$(2_q)$ If $g_1^{k_1} \cdots g_q^{k_q} = g_1^{k_1'} \cdots g_q^{k_q'}$ then $k_i \equiv k_i'$ mod $n_i$ for each $i$. Since each $g_i$
has order $n_i$ an equivalent statement is that if $g_1^{k_1} \cdots g_q^{k_q} = g_1^{k_1'} \cdots g_q^{k_q'}$ with
$0 \le k_i < n_i$ and $0 \le k_i' < n_i$ for each $i$, then $k_i = k_i'$ for each $i$.

If we have elements $g_1, \cdots, g_q$ satisfying $(1_q)$ and $(2_q)$ such that their products
$g_1^{k_1} \cdots g_q^{k_q}$ give all the elements of $G$, then by rewriting each product $g_1^{k_1} \cdots g_q^{k_q}$
as a $q$-tuple $(g_1^{k_1}, \cdots, g_q^{k_q})$ we see that $G$ is a product of cyclic groups of orders
$n_1, \cdots, n_q$ and the proof will be complete.

If the products $g_1^{k_1} \cdots g_q^{k_q}$ do not account for all elements of $G$ then we will
show how to find another element $g_{q+1}$ of order $n_{q+1}$ so that the conditions $(1_{q+1})$
and $(2_{q+1})$ are satisfied. This process can be iterated until all elements of $G$ are
exhausted since at each step the number of products $g_1^{k_1} \cdots g_q^{k_q}$ increases, at least
doubling in fact, and $G$ has only finitely many elements.

Assume inductively that we have already chosen elements $g_1, \cdots, g_q$ satisfying
$(1_q)$ and $(2_q)$. To find $g_{q+1}$ we consider congruence classes of elements of $G$
mod $g_1, \cdots, g_q$, which means that we consider each element $g$ as congruent to all
the products $g g_1^{k_1} \cdots g_q^{k_q}$ for arbitrary exponents $k_i$. Let $[g]_q$ denote the congruence
class of $g$, the set of all the elements $g g_1^{k_1} \cdots g_q^{k_q}$. In particular $[g]_q$ includes $g$
itself by choosing each $k_i$ to be 0. It is not hard to see that these congruence classes
$[g]_q$ form an abelian group with the product defined by $[g]_q[g']_q = [gg']_q$. Let this
group of congruence classes $[g]_q$ be denoted $[G]_q$. In particular when $q = 0$ we
start with $[G]_0 = G$ before we have chosen any of the elements $g_i$. We then start
the induction by choosing $g_1$ to be an element of $G = [G]_0$ of maximal order $n_1$.
Conditions $(1_1)$ and $(2_1)$ are then obviously satisfied.

For the induction step, if there are elements of $G$ that are not products $g_1^{k_1} \cdots g_q^{k_q}$
then $[G]_q$ has more than one element. Let $[g_{q+1}]_q$ be an element of $[G]_q$ of maximal
order $n_{q+1}$ in $[G]_q$. First we check that $n_{q+1}$ divides $n_q$. Since $[g_{q+1}]_q^{n_{q+1}} = [1]_q$
we have $g_{q+1}^{n_{q+1}} = g_1^{k_1} \cdots g_q^{k_q}$ for some exponents $k_i$. Then in $[G]_{q-1}$ we have
$[g_{q+1}]_{q-1}^{n_q} = [1]_{q-1}$ since Lemma 7.11 implies that all elements of $[G]_{q-1}$ have order
dividing the maximal order, which is $n_q$ by the inductive definition of $n_q$. The
equation $[g_{q+1}]_{q-1}^{n_q} = [1]_{q-1}$ means that $g_{q+1}^{n_q}$ is a product of powers of $g_1, \cdots, g_{q-1}$, so
it is certainly a product of powers of $g_1, \cdots, g_q$ which means $[g_{q+1}]_q^{n_q} = [1]_q$. Thus
$n_q$ is a multiple of $n_{q+1}$, the order of $[g_{q+1}]_q$ in $[G]_q$, as we wanted to show. Since
$(1_q)$ holds by inductive assumption, it follows that $n_{q+1}$ divides each $n_i$ with $i \le q$.


It is also true that $n_{q+1}$ divides each $k_i$ in the formula $g_{q+1}^{n_{q+1}} = g_1^{k_1} \cdots g_q^{k_q}$. To
see this, consider the power $g_{q+1}^{n_i}$. We can write this as $g_{q+1}^{n_i} = (g_{q+1}^{n_{q+1}})^{n_i/n_{q+1}} =
(g_1^{k_1} \cdots g_q^{k_q})^{n_i/n_{q+1}}$ with $n_i/n_{q+1}$ an integer since $n_{q+1}$ divides $n_i$. We can also
write $g_{q+1}^{n_i}$ as a product $g_1^{l_1} \cdots g_{i-1}^{l_{i-1}}$ since $[g_{q+1}^{n_i}]_{i-1} = [g_{q+1}]_{i-1}^{n_i} = [1]_{i-1}$ as a
consequence of the definition of $n_i$ as the maximal order of elements of $[G]_{i-1}$, so all
elements of $[G]_{i-1}$ have order dividing $n_i$ by Lemma 7.11. Since the two expressions
$(g_1^{k_1} \cdots g_q^{k_q})^{n_i/n_{q+1}}$ and $g_1^{l_1} \cdots g_{i-1}^{l_{i-1}}$ for $g_{q+1}^{n_i}$ are equal with $g_i$ not appearing
in the second expression, the property $(2_q)$ implies that the exponent $k_in_i/n_{q+1}$ on
$g_i$ in the first expression must be a multiple of $n_i$, the order of $g_i$ by $(1_q)$. Thus we
have $k_in_i/n_{q+1} = mn_i$ for some integer $m$. Canceling $n_i$ from this equation, we get
$k_i/n_{q+1} = m$ so $n_{q+1}$ divides $k_i$.

Next we would like to find an element $g_{q+1}g_1^{x_1} \cdots g_q^{x_q}$ congruent to $g_{q+1}$ mod
$g_1, \cdots, g_q$ and having order $n_{q+1}$ in $G$. The order of $g_{q+1}g_1^{x_1} \cdots g_q^{x_q}$ cannot be
less than $n_{q+1}$ since it determines the same element of $[G]_q$ as $g_{q+1}$ and $[g_{q+1}]_q$
has order $n_{q+1}$ in $[G]_q$. This means that we just need to find exponents $x_i$ so that
$(g_{q+1}g_1^{x_1} \cdots g_q^{x_q})^{n_{q+1}} = 1$. Since $g_{q+1}^{n_{q+1}} = g_1^{k_1} \cdots g_q^{k_q}$ we have:

$$(g_{q+1}g_1^{x_1} \cdots g_q^{x_q})^{n_{q+1}} = g_{q+1}^{n_{q+1}} g_1^{x_1n_{q+1}} \cdots g_q^{x_qn_{q+1}} = g_1^{k_1 + x_1n_{q+1}} \cdots g_q^{k_q + x_qn_{q+1}}$$

This will be 1 if $k_i + x_in_{q+1} = 0$ for each $i$. Solving $k_i + x_in_{q+1} = 0$ for $x_i$ gives
$x_i = -k_i/n_{q+1}$ with $x_i$ an integer since we have shown that $n_{q+1}$ divides $k_i$.
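Having found an element $g_{q+1}g_1^{x_1} \cdots g_q^{x_q}$ of order $n_{q+1}$, we replace $g_{q+1}$ by this
element, so the new $g_{q+1}$ has order $n_{q+1}$ in $G$. It remains to check condition $(2_{q+1})$.
If $g_1^{k_1} \cdots g_q^{k_q}g_{q+1}^{k_{q+1}} = g_1^{k_1'} \cdots g_q^{k_q'}g_{q+1}^{k_{q+1}'}$ then in $[G]_q$ we have $[g_{q+1}]_q^{k_{q+1}} = [g_{q+1}]_q^{k_{q+1}'}$.
Since the order of $[g_{q+1}]_q$ in $[G]_q$ is $n_{q+1}$ this implies that $k_{q+1} \equiv k_{q+1}'$ mod $n_{q+1}$,
hence $g_{q+1}^{k_{q+1}} = g_{q+1}^{k_{q+1}'}$ in $G$ since $g_{q+1}$ has order $n_{q+1}$. We can then cancel $g_{q+1}^{k_{q+1}}$
and $g_{q+1}^{k_{q+1}'}$ from the equation $g_1^{k_1} \cdots g_q^{k_q}g_{q+1}^{k_{q+1}} = g_1^{k_1'} \cdots g_q^{k_q'}g_{q+1}^{k_{q+1}'}$ to get
$g_1^{k_1} \cdots g_q^{k_q} = g_1^{k_1'} \cdots g_q^{k_q'}$. Since condition $(2_q)$ holds by induction, we have
$k_i \equiv k_i'$ mod $n_i$ for each $i \le q$. Thus $(2_{q+1})$ holds and we are done. □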


To illustrate how the preceding proof works, suppose we start with the group
$G = H_1 \times H_2$ where $H_1$ is cyclic of order 4 generated by an element $h_1$ and $H_2$ is
cyclic of order 2 generated by an element $h_2$ of order 2. In this case we already know
that $G$ is a product of cyclic groups, but suppose we forget this and just follow the
proof through. At the first step we choose an element $g_1$ in $G$ of maximal order,
so let us choose $g_1 = (h_1, 1)$ which has order 4 in $G$. There are then two congruence
classes of elements of $G$ mod $g_1$, namely the class consisting of the elements
$(h_1^{k_1}, h_2^{k_2})$ with $k_2 = 0$ and the class with $k_2 = 1$, so the group $[G]_1$ of congruence
classes mod $g_1$ has order 2. Intuitively, taking congruence classes mod $g_1$ amounts
just to ignoring the first coordinates of pairs $(h_1^{k_1}, h_2^{k_2})$ since we are free to change
this coordinate arbitrarily by multiplying $(h_1^{k_1}, h_2^{k_2})$ by any element $(h_1^k, 1)$. Next we
choose an element $g_2$ of maximal order in $[G]_1$. For this we can choose $g_2 = (h_1^{k_1}, h_2)$
for any $k_1$. If we choose $k_1$ to be 1 or 3 then $g_2$ will have order 4, which is larger
than the maximal order of elements of $[G]_1$ which is 2. The next-to-last paragraph of
the proof gives a procedure for rechoosing $g_2$ to have order equal to 2 rather than 4,
so in the present example this would amount to choosing $k_1$ to be 0 or 2 rather than
1 or 3. Either choice $k_1 = 0$ or $k_1 = 2$ will work, but if we choose $k_1 = 0$ then the
element $g_2$ becomes simply $(1, h_2)$ and a general product $g_1^{k_1}g_2^{k_2}$ becomes the general
element $(h_1^{k_1}, h_2^{k_2})$ of $H_1 \times H_2$.
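
The rechoosing step in this example can be traced in a few lines. The Python sketch below is an added illustration under stated assumptions: the group $H_1 \times H_2$ is written additively as pairs $(k_1, k_2)$ standing for $(h_1^{k_1}, h_2^{k_2})$, the first generator is fixed as $g_1 = (1, 0)$ with $n_1 = 4$ and $n_2 = 2$, and all function names are hypothetical.

```python
MODULI = (4, 2)  # (k1, k2) stands for the element (h1**k1, h2**k2) of H1 x H2

def add(u, v):
    """The group operation, written additively in each coordinate."""
    return tuple((x + y) % n for x, y, n in zip(u, v, MODULI))

def times(k, u):
    """The k-th power of u in the group, i.e. k*u in additive notation."""
    return tuple((k * x) % n for x, n in zip(u, MODULI))

def order(u):
    m = 1
    while times(m, u) != (0, 0):
        m += 1
    return m

def rechoose(g2, g1=(1, 0), n2=2):
    """Replace g2 by g2 * g1**x so that its order in G becomes n2.

    Here g2**n2 == g1**k, and x solves k + x*n2 == 0 as in the proof.
    """
    k = times(n2, g2)[0]
    x = -k // n2
    return add(g2, times(x, g1))

def all_products(g1, g2, n1=4, n2=2):
    """The set of products g1**a * g2**b with 0 <= a < n1 and 0 <= b < n2."""
    return {add(times(a, g1), times(b, g2)) for a in range(n1) for b in range(n2)}
```

Starting from $g_2 = (h_1^{k_1}, h_2)$ with $k_1 = 1$ or 3, which has order 4, `rechoose` returns an element with $k_1 = 0$ or 2, of order 2, and `all_products` then yields all eight elements of the group.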


From the preceding theorem we can deduce a general fact: 


Corollary 7.12. Each element of a finite abelian group has order dividing the order 
of the group. 


An equivalent statement is that if a finite abelian group $G$ has order $n$ then
$g^n = 1$ for each $g \in G$. This is because if $g^n = 1$ then the order of $g$ divides $n$ and
conversely.


Proof: By the theorem a finite abelian group $G$ is a product $G_1 \times \cdots \times G_q$ of cyclic
groups $G_i$. If the order of $G_i$ is $n_i$ then the order of $G$ is $n = n_1 \cdots n_q$. Each element
$g_i$ in $G_i$ is a power of a generator of $G_i$ which has order $n_i$ so $g_i^{n_i} = 1$ and hence
$g_i^n = 1$. For any element $g = (g_1, \cdots, g_q)$ of $G$ we then have $g^n = 1$. □


Fermat’s Little Theorem, which we encountered in the proof of quadratic
reciprocity in Section 6.4, is a special case of this corollary, the case that the group is the
group of congruence classes mod $p$ of integers coprime to $p$, for $p$ an odd prime.
The group operation is multiplication of congruence classes, and integers coprime to
$p$ have multiplicative inverses mod $p$ so one does indeed have a group. The order
of the group is $p - 1$, so each element has order dividing $p - 1$ which implies that
$a^{p-1} \equiv 1$ mod $p$ for each integer $a$ coprime to $p$, as Fermat’s Little Theorem asserts.

The proof we gave for Fermat’s Little Theorem in Section 6.4 extends easily to give
a simple proof of the corollary for any finite abelian group $G$. To see this, suppose
$G$ has order $n$, with the elements of $G$ being $g_1, \cdots, g_n$. For an arbitrary element $g$
in $G$ the multiples $gg_1, \cdots, gg_n$ are all distinct since if $gg_i = gg_j$ then multiplying
both sides of this equation by $g^{-1}$ gives $g_i = g_j$. Thus the sets $\{g_1, \cdots, g_n\}$ and
$\{gg_1, \cdots, gg_n\}$ are equal. Taking the product of all the elements in each of these two
sets and using commutativity of the multiplication operation, we have $g_1 \cdots g_n =
g^n g_1 \cdots g_n$ which implies $g^n = 1$.

Fermat’s Little Theorem was generalized by Euler to replace the prime $p$ by any
number $n$. Here one takes the group of congruence classes mod $n$ of numbers
coprime to $n$. As we know, these numbers have multiplicative inverses mod $n$ so we
again have a group. Its order is given by Euler’s function $\varphi(n)$, the number of positive
integers less than $n$ and coprime to $n$. The statement is then that $a^{\varphi(n)} \equiv 1$ mod $n$
for every $a$ coprime to $n$.
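
Euler's statement is simple to test numerically. The Python sketch below is an added illustration; it counts the integers $a$ with $1 \le a \le n$ coprime to $n$, which agrees with the description of $\varphi(n)$ above for $n > 1$, and it relies only on the built-in three-argument `pow` for modular exponentiation.

```python
from math import gcd

def phi(n):
    """Euler's function, computed naively by counting."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def euler_holds(n):
    """True if a**phi(n) is congruent to 1 mod n for every a coprime to n."""
    return all(pow(a, phi(n), n) == 1 for a in range(1, n) if gcd(a, n) == 1)
```

Taking $n$ to be a prime $p$ gives $\varphi(p) = p - 1$, so `euler_holds(p)` is a check of Fermat's Little Theorem.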

There are several different notations commonly used for the group of congruence
classes mod $n$ of integers coprime to $n$. We will write it as $\mathbb{Z}_n^*$ with $\mathbb{Z}_n$ denoting the set
of congruence classes of integers mod $n$ and the star indicating that we are only taking
congruence classes of integers coprime to $n$. One might wonder what the structure of
$\mathbb{Z}_n^*$ is as a product of cyclic groups. The first step in understanding this is to apply the
Chinese Remainder Theorem. As we saw in Section 2.3, if the prime factorization of
$n$ is $p_1^{e_1} \cdots p_k^{e_k}$ for distinct primes $p_i$, then specifying the congruence class mod $n$
of an integer coprime to $n$ is equivalent to specifying its congruence class mod $p_i^{e_i}$
for each $i$, with the latter classes being coprime to $p_i^{e_i}$ (which is the same as being
coprime to $p_i$). This amounts to saying that $\mathbb{Z}_n^*$ is the product
$\mathbb{Z}_{p_1^{e_1}}^* \times \cdots \times \mathbb{Z}_{p_k^{e_k}}^*$.
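
The correspondence can be made explicit for a small modulus. The Python sketch below is an illustration added here, with hypothetical function names; it sends each unit mod $n$ to the tuple of its residues mod the prime powers supplied by the caller and checks that this map is a bijection compatible with multiplication.

```python
from itertools import product
from math import gcd

def units(n):
    """Representatives 1 <= a < n of the classes coprime to n (for n > 1)."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def crt_map(n, prime_powers):
    """Send each unit a mod n to its residues mod the given prime powers."""
    return {a: tuple(a % q for q in prime_powers) for a in units(n)}

def is_isomorphism(n, prime_powers):
    """Check that crt_map is bijective onto the product and respects products."""
    image = crt_map(n, prime_powers)
    target = set(product(*(units(q) for q in prime_powers)))
    bijective = len(image) == len(target) and set(image.values()) == target
    multiplicative = all(
        image[a * b % n] == tuple(x * y % q for x, y, q in zip(image[a], image[b], prime_powers))
        for a in image for b in image
    )
    return bijective and multiplicative
```

For $n = 45 = 9 \cdot 5$ the call `is_isomorphism(45, (9, 5))` succeeds, matching the 24 units mod 45 with the $6 \cdot 4$ pairs of units mod 9 and mod 5.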


This gives a reduction to the case of a prime power $p^r$. When $p$ is an odd prime
the group $\mathbb{Z}_{p^r}^*$ is cyclic, while $\mathbb{Z}_{2^r}^*$ is cyclic when $r \le 2$ but for larger $r$ it is the
product of two cyclic groups, one of order $2^{r-2}$ and the other of order 2. These facts
will not be needed in the rest of the book so we will not prove them but will instead
just look at a few examples. Some cases when $\mathbb{Z}_n^*$ is cyclic are shown in the following
figures where the elements of $\mathbb{Z}_n^*$ label the vertices of a polygon and multiplication
by a generator of $\mathbb{Z}_n^*$ rotates the polygon, taking each vertex to the next vertex.

[Figure: polygon diagrams of four cyclic groups $\mathbb{Z}_n^*$, beginning with $\mathbb{Z}_7^*$ and $\mathbb{Z}_9^*$.]

For example in the first figure the group $\mathbb{Z}_7^*$ is cyclic of order 6 generated by 3 with
the powers of 3 mod 7 being 3, 2, 6, 4, 5, 1. Notice that when $\mathbb{Z}_n^*$ is cyclic, any two
opposite vertices are negatives of each other mod n, corresponding to the fact that -1
is the only element of order 2 in $\mathbb{Z}_n^*$ and multiplication by -1 rotates the polygon 180
degrees. Note also that reflecting the polygon across its horizontal axis of symmetry
sends each element of $\mathbb{Z}_n^*$ to its multiplicative inverse in $\mathbb{Z}_n^*$.
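
The polygon picture can be reproduced by listing the powers of a generator in order. This is a small sketch with helper names of our own choosing (`order`, `generators`, `polygon`), assuming n is at least 2.

```python
from math import gcd

def units(n):
    return [a for a in range(1, n) if gcd(a, n) == 1]

def order(a, n):
    """Smallest k >= 1 with a**k congruent to 1 mod n, for a coprime to n."""
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def generators(n):
    """Elements whose order equals the order of the whole group of units."""
    us = units(n)
    return [a for a in us if order(a, n) == len(us)]

def polygon(n, g):
    """Vertices in rotation order: g, g**2, ..., ending with g**m = 1."""
    m = len(units(n))
    return [pow(g, k, n) for k in range(1, m + 1)]

print(polygon(7, 3))      # the powers of 3 mod 7
print(generators(8))      # an empty list, since the group mod 8 is not cyclic
```

In the list returned by `polygon` the entries half a turn apart sum to 0 mod n, and the entries at positions k and m - k multiply to 1 mod n, matching the two observations about opposite vertices and reflection.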

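Some cases when $\mathbb{Z}_n^*$ is not cyclic but is the product of a cyclic group of order 2
with a cyclic group are shown in the next three figures.

[Figure: Cayley graphs of three non-cyclic groups $\mathbb{Z}_n^*$, the first and last being $\mathbb{Z}_{16}^*$ and $\mathbb{Z}_{32}^*$, each drawn as an inner and an outer polygon.]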


Here the cyclic factor of order 2 is generated by —1 and multiplication by —1 takes 
each vertex of the inner polygon to the adjacent vertex of the outer polygon and vice 
versa. Multiplication by a generator of the other cyclic factor rotates the whole figure. 
Multiplicative inverses are again given by reflection across the horizontal axis. 

Each of these diagrams is known as a Cayley graph for the group. The graph has a
vertex for each element of the group, and two vertices are joined by an edge whenever 
one group element is obtained from another by multiplication by one of a chosen set 
of generators for the group. In the first four examples the group was cyclic so it had a 
single generator, while in the last three examples the group had two generators, one 
for each cyclic factor. 
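
A Cayley graph of this kind is simple to build directly from the definition. The sketch below is illustrative only: `cayley_graph` and `is_connected` are our own names, and the choice of 3 and 15 (that is, -1) as generators mod 16 is an assumption made for the example rather than the labeling used in the figures.

```python
from math import gcd

def cayley_graph(n, generators):
    """Vertices are the units mod n; a and a*g are joined for each chosen generator g."""
    vertices = [a for a in range(1, n) if gcd(a, n) == 1]
    edges = {frozenset((a, a * g % n)) for a in vertices for g in generators}
    return vertices, edges

def is_connected(vertices, edges):
    """The graph is connected exactly when the chosen elements generate the group."""
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        a = stack.pop()
        for edge in edges:
            if a in edge:
                for b in edge:
                    if b not in seen:
                        seen.add(b)
                        stack.append(b)
    return len(seen) == len(vertices)

vertices, edges = cayley_graph(16, [3, 15])
print(len(vertices), len(edges), is_connected(vertices, edges))
```

With the single generator 3 the graph mod 16 falls apart into two squares, which is one way to see that 3 alone does not generate the group.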


The preceding Corollary 7.12 implies that a finite abelian group G of prime order 
p must be cyclic since any nonidentity element of G must have order p. This holds 
more generally if the order of G is a product of distinct primes since in a factorization 
of G as a product of cyclic groups these groups must all have coprime orders so their 
product will also be cyclic by repeated applications of Proposition 7.8. 

By Proposition 7.8, every cyclic group whose order is not a power of a prime can 
be expressed as a product of two cyclic groups of smaller order. Applying this fact 
repeatedly, every cyclic group is a product of cyclic groups of prime power order. 
Hence by Theorem 7.9 every finite abelian group is a product of cyclic groups of 
prime power order. A cyclic group of prime power order $p^k$ cannot be factored as a
product since the factors would have orders $p^l$ for $l < k$ so the elements of the factors
would have orders dividing $p^{k-1}$, hence the same would be true for all elements of
the product, contradicting the fact that it is cyclic of order $p^k$ and so contains an
element of order $p^k$.


Proposition 7.13. The factorization of a finite abelian group as a product of cyclic 
groups of prime power order is unique in the sense that any two such factorizations 
have the same number of factors of each order. 


For example, if we let $C_n$ denote a cyclic group of order n, then the only two
abelian groups of order 4 are $C_4$ and $C_2 \times C_2$. For order 8 there are three possibilities:
$C_8$, $C_4 \times C_2$, and $C_2 \times C_2 \times C_2$. For order 16 there are five possibilities: $C_{16}$, $C_8 \times C_2$,
$C_4 \times C_4$, $C_4 \times C_2 \times C_2$, and $C_2 \times C_2 \times C_2 \times C_2$. These examples illustrate the general fact
that the abelian groups of order a prime power $p^k$ correspond exactly to the different
partitions of k as a sum of numbers from 1 to k. In the case of $2^4 = 16$ these were
the five partitions 4, 3 + 1, 2 + 2, 2 + 1 + 1, and 1 + 1 + 1 + 1. (The order of the
terms does not matter, so 2 + 1 + 1 is regarded as the same partition as 1 + 2 + 1 and
1 + 1 + 2.)

For groups whose order is a product of powers of different primes one just 
combines the various groups of each prime power independently. Thus for order 
$144 = 9 \cdot 16$ there are ten possibilities, the products of the five groups of order 16
listed above with either of the two groups $C_9$ and $C_3 \times C_3$ of order 9.
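
The counting by partitions can be carried out mechanically. In the following sketch the names `partitions`, `prime_factorization`, and `abelian_groups` are ours, and a group is recorded simply as the tuple of orders of its prime-power cyclic factors.

```python
from itertools import product

def partitions(k, largest=None):
    """Yield the partitions of k as non-increasing tuples of positive integers."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

def prime_factorization(n):
    """Return [(p, k), ...] with n equal to the product of the powers p**k."""
    result, p = [], 2
    while p * p <= n:
        k = 0
        while n % p == 0:
            n //= p
            k += 1
        if k:
            result.append((p, k))
        p += 1
    if n > 1:
        result.append((n, 1))
    return result

def abelian_groups(n):
    """All abelian groups of order n, each as a tuple of prime-power cyclic orders."""
    choices = [[tuple(p**e for e in part) for part in partitions(k)]
               for p, k in prime_factorization(n)]
    return [sum(combo, ()) for combo in product(*choices)]

print(abelian_groups(16))
print(len(abelian_groups(144)))
```

For order 16 the tuples come out in the same order as the five partitions of 4 listed above, and for order 144 the count is the product of the counts for 16 and 9.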

Thus we see that there is exactly one abelian group of order n precisely when n
is a product of distinct primes. In that case the group is a product of cyclic groups
of distinct prime orders, making the whole group cyclic.

Proof of Proposition 7.13: The idea will be to characterize the number of cyclic
factors of each prime power order in an intrinsic way that does not depend on a particular
choice of factorization. For a prime p dividing the order of a finite abelian group G
let G(p) be the set of elements in G whose order is a power of p, including the
identity element 1 of order $p^0$. Note that an element g has order a power of p exactly
when $g^{p^n} = 1$ for some n. Given a factorization of G as a product $G_1 \times \cdots \times G_k$ of
cyclic groups of prime power order, an element $g = (g_1, \dots, g_k)$ of G has order a
power of p exactly when each coordinate $g_i$ has order a power of p since if $g^{p^n} = 1$
then $g_i^{p^n} = 1$ for each i and conversely if $g_i^{p^{n_i}} = 1$ for each i then $g^{p^n} = 1$ for n
the largest $n_i$. For the factors $G_i$ whose order is a power of a prime different from p
the only way to have $g_i^{p^n} = 1$ is when $g_i = 1$. We can therefore regard G(p) as the
product of the factors $G_i$ of order a power of p. This gives a characterization of the
product of these factors $G_i$ that does not depend on the choice of the factorization
of G.


Thus the problem reduces to the case that G = G(p), i.e., G has order $p^n$ for
some n, so we assume this from now on. It remains to give an intrinsic characterization
of the number of cyclic factors of order $p^r$ for each r. We will do this by counting
the number of elements in the set G[q] of elements of G that are $p^q$th powers of
elements of G. We have G = G[0] since all elements are $p^0$th powers. Also, G[q]
contains G[q + 1] for each q since $g^{p^{q+1}} = (g^p)^{p^q}$, so a $p^{q+1}$st power is also a $p^q$th
power. The identity element 1 belongs to G[q] for all q.


A cyclic group of order $p^r$ generated by an element g contains exactly p elements
that are $p^{r-1}$st powers, namely $g^{p^{r-1}}, (g^2)^{p^{r-1}}, (g^3)^{p^{r-1}}, \dots, (g^p)^{p^{r-1}} = 1$
since these are distinct $p^{r-1}$st powers but after this there are just repetitions, with
$(g^{p+1})^{p^{r-1}} = g^{p^{r-1}}$, $(g^{p+2})^{p^{r-1}} = (g^2)^{p^{r-1}}$ and so on. Similarly there are $p^s$ elements
that are $p^{r-s}$th powers, the elements $g^{p^{r-s}}, (g^2)^{p^{r-s}}, (g^3)^{p^{r-s}}, \dots, (g^{p^s})^{p^{r-s}} = 1$.
For a product $G_1 \times \cdots \times G_k$ an element $(g_1, \dots, g_k)$ is a $p^s$th power exactly when
each coordinate $g_i$ is a $p^s$th power in $G_i$.


For a given factorization $G = G_1 \times \cdots \times G_k$ as a product of cyclic groups of
order a power of p let $\mu(r)$ be the number of cyclic factors $G_i$ of order $p^r$ and
let $p^m$ be the maximal order of elements of G. Thus G[m] consists of just the
identity element of G since the $p^m$th power of each element of G is the identity.
From the preceding paragraph we see that the number of elements in G[m - 1] is
$p^{\mu(m)}$ since each of the $\mu(m)$ factors $G_i$ of order $p^m$ has p elements that are $p^{m-1}$st
powers and in the other factors only the identity is a $p^{m-1}$st power. Since G[m - 1]
has $p^{\mu(m)}$ elements it follows that G[m - 1] determines $\mu(m)$. Next, G[m - 2]
contains $(p^2)^{\mu(m)} p^{\mu(m-1)} = p^{2\mu(m)+\mu(m-1)}$ elements, with $p^2$ elements coming from
each factor of order $p^m$ and p elements coming from each factor of order $p^{m-1}$. Thus
G[m - 2] determines $\mu(m-1)$ since $\mu(m)$ has already been determined. Continuing
in the same way, we see that G[m - s] has $p^{s\mu(m)+(s-1)\mu(m-1)+\cdots+\mu(m-s+1)}$ elements,
the last case being s = m with G[0] = G. It follows by induction on s that the
subsets G[m - s] determine all the numbers $\mu(r)$. Since the sets G[q] are defined
independently of how G is factored as a product of cyclic groups, this finishes the
proof. □
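
The inductive step of the proof amounts to a small algorithm: the sizes of the sets G[q] alone determine the numbers of factors of each order. The sketch below illustrates this for a group of p-power order given, as an assumption of the example, by a list `orders` of cyclic orders and written additively, so that a $p^q$th power becomes a multiple by $p^q$. The names `count_pq_powers` and `recover_mu` are ours.

```python
from itertools import product

def count_pq_powers(orders, p, q):
    """Size of G[q] for G = Z_{n_1} x ... x Z_{n_k} with each n_i a power of p."""
    elements = product(*[range(n) for n in orders])
    return len({tuple(p**q * x % n for x, n in zip(g, orders)) for g in elements})

def recover_mu(orders, p):
    """Recover mu(r), the number of factors of order p**r, from the sizes |G[q]| only."""
    m = 0
    while count_pq_powers(orders, p, m) > 1:   # G[m] is trivial exactly at the maximal order p**m
        m += 1
    mu = {}
    for s in range(1, m + 1):
        size = count_pq_powers(orders, p, m - s)
        exponent = 0
        while p**exponent < size:              # size == p**exponent
            exponent += 1
        # exponent = s*mu(m) + (s-1)*mu(m-1) + ... + 1*mu(m-s+1); all but the last term are known
        known = sum((s - j) * mu[m - j] for j in range(s - 1))
        mu[m - s + 1] = exponent - known
    return mu

print(recover_mu([8, 2, 2], 2))
print(recover_mu([4, 4], 2))
```

Reordering the list `orders` does not change the output, which reflects the fact that the sets G[q] do not depend on the factorization.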


The factorization of a finite abelian group as a product of cyclic groups of prime 
power order is the unique factorization with the largest number of factors since any 
other factorization with at least as many factors could be factored further into a
product with prime-power cyclic factors, contradicting the uniqueness statement in 
the preceding proposition. 

On the other hand there can be different factorizations into cyclic factors with
the smallest number of factors. For example, if p and q are distinct primes then
$C_{p^2q^2} \times C_{pq}$ and $C_{p^2q} \times C_{pq^2}$ are both the group $C_{p^2} \times C_p \times C_{q^2} \times C_q$. A natural way
to factor a group G as a product $G_1 \times \cdots \times G_k$ of cyclic groups with the minimum
number of factors is via the following procedure. First factor G as a product of cyclic
groups of prime power order. For each prime $p_i$ dividing the order of G let $G(p_i)$ be
the product of the factors of G whose order is a power of $p_i$. For each $p_i$ choose a
factor of $G(p_i)$ of maximal order and let $G_1$ be the product of these chosen factors,
so $G_1$ has one factor for each prime $p_i$ and is cyclic by Proposition 7.8. Now repeat
the process for the remaining factors of G to obtain $G_2$, then once again for $G_3$ and
so on until all the prime power cyclic factors of G have been exhausted. In the end
the number of factors $G_i$ is equal to the maximum number of factors in all of the
groups $G(p_i)$.

In the preceding example of a group of order $p^3q^3$ this would yield the
factorization $C_{p^2q^2} \times C_{pq}$. In general this procedure yields a product $C_{n_1} \times \cdots \times C_{n_k}$ with
each $n_i$ divisible by $n_{i+1}$. This is the same factorization of G as the one produced in
the proof of Theorem 7.9 since it is uniquely determined by the condition that each
$n_i$ is divisible by $n_{i+1}$.
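
The procedure just described can be sketched in a few lines. Here a group is given by the list of orders of its prime-power cyclic factors, and the function name `invariant_factors` is our own label for the resulting numbers $n_1, n_2, \dots$.

```python
from collections import defaultdict

def invariant_factors(prime_powers):
    """Regroup prime-power cyclic orders into n_1, n_2, ... with each n_i divisible by n_{i+1}."""
    by_prime = defaultdict(list)
    for q in prime_powers:
        p = next(d for d in range(2, q + 1) if q % d == 0)   # the prime of which q is a power
        by_prime[p].append(q)
    for powers in by_prime.values():
        powers.sort(reverse=True)
    rounds = max((len(powers) for powers in by_prime.values()), default=0)
    result = []
    for i in range(rounds):
        n = 1
        for powers in by_prime.values():
            if i < len(powers):        # take the i-th largest power of each prime, if any remain
                n *= powers[i]
        result.append(n)
    return result

# With p = 2 and q = 3, both factorizations of the example lead to the same answer.
print(invariant_factors([4, 9, 2, 3]))   # from C_36 x C_6
print(invariant_factors([4, 3, 2, 9]))   # from C_12 x C_18
```

The number of entries in the result is the largest number of cyclic factors belonging to a single prime, as stated above.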


In the rest of this section we will give two propositions about finite abelian groups 
that will be applied to class groups in the next two sections. Both propositions have 
to do with the operation of squaring elements of a group. For the first proposition 
we consider elements of a finite abelian group G whose square is the identity. These 
are the elements of order 1 or 2. These elements form a subgroup of G, that is, a 
subset which is a group in its own right. For a subset H of a group G to be a subgroup 
amounts to H satisfying three properties: 


(1) The product of two elements of H is again in H, so within H there is a multipli- 
cation operation defined, the same multiplication as in G. The multiplication in 
H is automatically associative since multiplication in G is associative. 


(2) H contains the identity element of G. 


(3) The inverse of each element of H is in H.

These properties hold when H consists of the elements of order 1 or 2 in a finite
abelian group G since property (1) means that if $g_1^2 = 1$ and $g_2^2 = 1$ for elements $g_1$
and $g_2$ of G then $(g_1g_2)^2 = 1$, which is true since $(g_1g_2)^2 = g_1^2g_2^2$ when G is abelian,
while property (2) holds since the identity element of G has order 1 and (3) holds
since $g = g^{-1}$ if $g^2 = 1$.

Proposition 7.14. In a finite abelian group G the elements whose order is 1 or 2 
form a subgroup of order $2^e$ where e is the number of factors of even order in any
factorization of G as a product of cyclic groups. This subgroup is a product of e
cyclic groups of order 2, and the order of G is a multiple of $2^e$.


In general, when a finite abelian group G is factored as a product of cyclic groups 
of prime power order, the number of factors of order a power of the prime p is called 
the p-rank of G. The number e in the proposition is thus the 2-rank of G. The
proposition easily generalizes to the statement that the number of elements of G of
order 1 or p is $p^r$ where r is the p-rank of G.


Proof: Let $G = G_1 \times \cdots \times G_k$ be a factorization of G as a product of cyclic groups.
An element $(g_1, \dots, g_k)$ of the product has order 1 or 2 exactly when each $g_i$ has
order 1 or 2. A cyclic group $C_{2n}$ of even order generated by an element g has just
one element of order 2, the element $g^n$, since a power $g^k$ with $0 < k < n$ has $g^{2k} \ne 1$
and the inverses of these elements are the powers $g^k$ with $n < k < 2n$ so these too
do not have order 2. A cyclic group of odd order has no elements of order 2 since the
order of an element always divides the order of the group. Thus if e is the number of
factors $G_i$ of even order, there are e coordinates $g_i$ of $(g_1, \dots, g_k)$ where we have
a choice of two elements of $G_i$ of order 1 or 2 and in the other coordinates we must
have $g_i = 1$. The elements of order 1 or 2 thus form a product of e cyclic groups of
order 2. The last statement of the proposition is then obvious. □
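
The count in Proposition 7.14 can be confirmed by brute force on small examples. In this sketch a product of cyclic groups is written additively as tuples of residues, so the condition $g^2 = 1$ becomes g + g = 0; the function names are our own.

```python
from itertools import product

def elements_of_order_at_most_2(orders):
    """Elements g of Z_{n_1} x ... x Z_{n_k} with g + g = 0."""
    return [g for g in product(*(range(n) for n in orders))
            if all(2 * x % n == 0 for x, n in zip(g, orders))]

def even_factor_count(orders):
    """The number e of cyclic factors of even order."""
    return sum(1 for n in orders if n % 2 == 0)

for orders in ([8, 2, 3], [12, 6, 5], [3, 5], [4, 4]):
    found = len(elements_of_order_at_most_2(orders))
    print(orders, found, 2 ** even_factor_count(orders))
```

The factorizations used here need not be into prime-power factors, in line with the wording "any factorization" in the proposition.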


For any abelian group G we can form another group denoted $G/G^2$ whose
elements are congruence classes of elements of G mod squares, so $g_1 \equiv g_2$ if $g_1 = g_2g^2$
for some $g \in G$. This is analogous to taking congruence classes of integers mod 2
except now the group operation is multiplication rather than addition. The
multiplication in $G/G^2$ comes from multiplication in G, so if we denote the congruence class
of $g \in G$ by [g] then $[g_1][g_2]$ is defined to be $[g_1g_2]$. This is unambiguous since
if $g_1 \equiv g_1'$ and $g_2 \equiv g_2'$, so $g_1' = g_1h_1^2$ and $g_2' = g_2h_2^2$ for some $h_1, h_2 \in G$, then
$g_1g_2 \equiv g_1'g_2'$ since $g_1'g_2' = g_1g_2(h_1h_2)^2$. The identity element of $G/G^2$ is [1] where
1 is the identity of G, and $[g]^{-1} = [g^{-1}]$. Since associativity in $G/G^2$ follows
automatically from associativity in G, we conclude that $G/G^2$ is a group, which is abelian
since G is abelian.


Proposition 7.15. For a finite abelian group G factored as a product $G_1 \times \cdots \times G_k$
of cyclic groups $G_i$ the group $G/G^2$ is a product of cyclic groups of order 2 with
one factor for each factor $G_i$ of even order.

Proof: For $G = G_1 \times \cdots \times G_k$ the square of an element $(g_1, \dots, g_k)$ is $(g_1^2, \dots, g_k^2)$,
so the group $G/G^2$ is the product of the groups $G_i/G_i^2$. Thus the proposition reduces
to the special case that G is a cyclic group. If G is cyclic of even order 2n with
generator g then the squares in G are the even powers $g^2, g^4, \dots, g^{2n}, g^{2n+2} = g^2, g^{2n+4} =
g^4, \dots$ which are all congruent to 1. The odd powers $g, g^3, \dots, g^{2n-1}, g^{2n+1} =
g, g^{2n+3} = g^3, \dots$ are all congruent to each other but not to any even power of g
so $G/G^2$ is cyclic of order 2. If G is cyclic of odd order 2n + 1 then the squares
$g^2, g^4, \dots, g^{2n}, g^{2n+2} = g, g^{2n+4} = g^3, \dots$ form all of G so $G/G^2$ has order 1. □
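
As a numerical check of Proposition 7.15, the order of $G/G^2$ can be computed as the order of G divided by the number of squares. The sketch below again writes groups additively, so squares become doubles; `quotient_by_squares_order` is a name chosen for this example.

```python
from itertools import product

def quotient_by_squares_order(orders):
    """Order of G/G^2 for G = Z_{n_1} x ... x Z_{n_k}, computed as |G| / |{2g}|."""
    elements = list(product(*(range(n) for n in orders)))
    squares = {tuple(2 * x % n for x, n in zip(g, orders)) for g in elements}
    return len(elements) // len(squares)

for orders in ([6], [9], [8, 2, 3], [4, 4]):
    even = sum(1 for n in orders if n % 2 == 0)
    print(orders, quotient_by_squares_order(orders), 2 ** even)
```

Comparing with Proposition 7.14, the group $G/G^2$ has the same order as the subgroup of elements of order 1 or 2.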


Exercises 


1. Show the converse of Proposition 7.8: If a product $G_1 \times G_2$ of finite abelian groups
is cyclic then $G_1$ and $G_2$ are cyclic of coprime orders.


2. Show that if a prime p divides the order of a finite abelian group G then G contains
an element of order p. For which nonprimes is this also true? 


3. For each abelian group of order 4, 8, or 16 determine the number of elements of 
each possible order. 


4. Determine the maximum order of elements of a finite abelian group G in terms of 
the factorization of G as a product of cyclic groups of prime power order, and show 
that the orders of elements of G are exactly all the divisors of this maximal order. 


5. (a) State and prove the analogue of Proposition 7.14 with 2 replaced by an odd 
prime p. 
(b) Do the same for Proposition 7.15. 


6. This problem concerns the question of when the group $\mathbb{Z}_n^*$ of congruence classes
mod n of integers coprime to n is cyclic.

(a) Show that $\mathbb{Z}_2^*$ and $\mathbb{Z}_4^*$ are cyclic but $\mathbb{Z}_8^*$ is not cyclic and deduce that $\mathbb{Z}_{2^r}^*$ is also
not cyclic when r > 3.

(b) Show that if $\mathbb{Z}_n^*$ is cyclic then n = 2, 4, $p^r$, or $2p^r$ for some odd prime p. Hint:
$\mathbb{Z}_n^*$ has even order if n > 2.

(c) The group $\mathbb{Z}_{p^r}^*$ is known to be cyclic. Show that this implies that $\mathbb{Z}_{2p^r}^*$ is cyclic.


7. Describe each of the following groups $\mathbb{Z}_n^*$ as a product of cyclic groups and draw
a Cayley graph: Zip, Zj3, Zis, Z54, and Z§o. 


7.4 Symmetry and the Class Group


We have defined the symmetric class number $h_\Delta^s$ for discriminant Δ to be the
number of equivalence classes of primitive forms of discriminant Δ whose topographs
have mirror symmetry. Thus $h_\Delta^s$ is the number of elements in the class group CG(Δ)
whose order is 1 or 2 since mirror symmetric forms correspond to elements of CG(Δ)
satisfying $Q = Q^{-1}$, which is the same as saying $Q^2 = 1$. (For symmetric forms there
is no distinction between equivalence and proper equivalence.) As we saw in the
discussion before Proposition 7.14, these elements form a subgroup of CG(Δ) which
could be called the symmetric class group with the notation SCG(Δ). Its order is
$h_\Delta^s$, and it is a product of cyclic groups of order 2 since each element has order 1
or 2.

From Proposition 7.14 we can immediately deduce the following result:


Proposition 7.16. (a) The symmetric class number $h_\Delta^s$ is equal to $2^r$ where r is
the 2-rank of CG(Δ), the number of cyclic factors of CG(Δ) of order a power of 2
when CG(Δ) is expressed as a product of cyclic groups of prime-power order.

(b) The ordinary class number $h_\Delta$ is always a multiple of $h_\Delta^s$, with $h_\Delta = h_\Delta^s$ exactly
when CG(Δ) is a product of cyclic groups of order 2, and $h_\Delta^s = 1$ exactly when $h_\Delta$
is odd. □


Applying Theorem 5.9 which computed $h_\Delta^s$ in terms of the prime factorization
of Δ we conclude:


Corollary 7.17. If the number of distinct prime divisors of Δ is k then the 2-rank
of CG(Δ) is k - 1 except when Δ = 4(4m + 1) when the 2-rank is k - 2, and when
Δ = 32m when the 2-rank is k. In particular the 2-rank is k - 1 when Δ is a
fundamental discriminant. □


From this corollary we can deduce another: 

Corollary 7.18. If |Δ| is prime then the class number $h_\Delta$ is odd. □


We know that CG(Δ) is cyclic if the class number is prime or a product of distinct
primes, but there are other cases when the structure of CG(Δ) as a product of cyclic
groups is completely determined if one knows the class number as well as the prime
factorization of Δ, using the fact that the latter determines the 2-rank of CG(Δ)
as in Corollary 7.17. For example if the class number is 4 then CG(Δ) is either $C_4$
or $C_2 \times C_2$ and these two cases are distinguished by their 2-ranks. We saw this
distinction between $C_4$ and $C_2 \times C_2$ for the fundamental discriminants -56 and -84
both of which have class number 4, but -56 has two distinct prime divisors so its
class group is $C_4$ while -84 has three distinct prime divisors so its class group is
$C_2 \times C_2$.

A similar thing works for class number 8 where the group is either $C_8$, $C_4 \times C_2$, or
$C_2 \times C_2 \times C_2$, with different 2-ranks. On the other hand, for class number 16 there is
an ambiguity between $C_8 \times C_2$ and $C_4 \times C_4$. The first negative discriminant with class
number 16 is $\Delta = -399 = -3 \cdot 7 \cdot 19$, a fundamental discriminant. Since there are
three distinct prime factors of Δ the 2-rank of CG(Δ) is 2 so the ambiguity between
$C_8 \times C_2$ and $C_4 \times C_4$ arises here. It is easy to compute that there are ten reduced forms
of discriminant -399:


[1,1,100] [2,1,50] [4,1,25] [5,1,20] [10,1,10] 
[3,3,34] [6,3,17] [7,7,16] [8,7,14] [10,9,12] 


Labeling these as $Q_1, \dots, Q_5$ in the first row and $Q_6, \dots, Q_{10}$ in the second row,
we see that there are four forms with mirror symmetry, $Q_1$, $Q_5$, $Q_6$, $Q_8$, the forms
with two of their coefficients equal. This is in agreement with the 2-rank being 2.
The six without symmetry count double in the class number which is therefore 16.
To determine whether the class group is $C_8 \times C_2$ or $C_4 \times C_4$ it suffices to look for
elements of order greater than 4. This happens to be very easy in this case if we look
at which forms represent powers of 2. In the list above we see that $Q_2$ represents 2,
$Q_3$ represents 4, $Q_9$ represents 8, and $Q_8$ represents 16. Since powers of primes
not dividing the discriminant are always represented by unique equivalence classes of
forms, it follows that $Q_2^2 = Q_3^{\pm 1}$, $Q_2^3 = Q_9^{\pm 1}$, and $Q_2^4 = Q_8$, with no sign ambiguity in
the last case since $Q_8$ has mirror symmetry. In particular we see that $Q_2$ must have
order greater than 4, so CG(Δ) is not $C_4 \times C_4$ and hence it must be $C_8 \times C_2$.
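
The list of reduced forms and the resulting class number are easy to reproduce by machine. The sketch below assumes the convention used in the list above, namely reduced forms [a, b, c] with 0 ≤ b ≤ a ≤ c, one for each equivalence class, with mirror symmetry detected by two coefficients being equal or b = 0. The function names are our own.

```python
from math import gcd

def reduced_forms(disc):
    """Primitive forms [a, b, c] of discriminant disc < 0 with 0 <= b <= a <= c."""
    forms = []
    a = 1
    while 3 * a * a <= -disc:                  # b <= a <= c forces 3a^2 <= |disc|
        for b in range(disc % 2, a + 1, 2):    # b has the same parity as disc
            numerator = b * b - disc           # this must equal 4ac
            if numerator % (4 * a) == 0:
                c = numerator // (4 * a)
                if c >= a and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def has_mirror_symmetry(form):
    a, b, c = form
    return b == 0 or a == b or a == c

def class_number(disc):
    """Symmetric forms count once, the others twice (the form and its mirror image)."""
    forms = reduced_forms(disc)
    symmetric = sum(1 for f in forms if has_mirror_symmetry(f))
    return symmetric + 2 * (len(forms) - symmetric)

forms = reduced_forms(-399)
print(forms)
print([f for f in forms if has_mirror_symmetry(f)], class_number(-399))
```

The same functions applied to -56 and -84 give class number 4 in both cases, with two symmetric forms for -56 and four for -84, in agreement with the 2-ranks 1 and 2.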

The order of $Q_2$ is 8 since there are no elements of order 16 in $C_8 \times C_2$. (This
also follows from the fact that $Q_8$ has mirror symmetry hence must have order 2.)
As in the proof of Theorem 7.9 we can choose $Q_2$ as a generator of the $C_8$ factor of
CG(Δ), and a generator of the $C_2$ factor can be chosen to be either $Q_5$ or $Q_6$, the two
forms with mirror symmetry that are not a power of $Q_2$. Additional work would be
needed to compute the remaining products $Q_iQ_j$ such as whether $Q_2^2$ is $Q_3$ or $Q_3^{-1}$.
However some products can be determined without calculation, for example the fact
that the product of any two of the symmetric forms $Q_5$, $Q_6$, $Q_8$ equals the third since
the product of two elements of order 2 must have order 1 or 2, but for example
$Q_5Q_6$ cannot be the identity element $Q_1$ nor can it be $Q_5$ or $Q_6$ so it must be $Q_8$.
Thus the elements $Q_1$, $Q_5$, $Q_6$, and $Q_8$ form a subgroup $C_2 \times C_2$. This is just the
symmetric class group SCG(Δ).

A similar but even simpler sort of ambiguity occurs for class numbers $p^2$ with
p an odd prime, where the choice is between the groups $C_{p^2}$ and $C_p \times C_p$. The
first example of this sort among negative discriminants occurs when Δ = -199. The
reduced forms are $Q_1 = [1,1,50]$, $Q_2 = [2,1,25]$, $Q_3 = [5,1,10]$, $Q_4 = [4,3,13]$,
and $Q_5 = [7,5,8]$. Only $Q_1$ has mirror symmetry so the other four forms count twice
in the class number which is therefore 9. To decide whether CG(Δ) is $C_9$ or $C_3 \times C_3$
we observe that $Q_2$ represents 2, $Q_4$ represents $2^2$, and $Q_5$ represents $2^3$, so $Q_2$
must have order greater than 3 in CG(Δ). Since the order of $Q_2$ must divide the
order of CG(Δ) we see that $Q_2$ has order 9 and so CG(Δ) is $C_9$ rather than $C_3 \times C_3$.


The order of the class group can be made arbitrarily large by taking Δ to have a
large number of distinct prime factors, using a product of distinct odd primes if one 
wants a fundamental discriminant. It is also possible for individual elements of the 
class group to have large order: 


Proposition 7.19. For arbitrary integers a > 1 and n > 1 the form $[a, 1, a^{n-1}]$ has
order n in CG(Δ) for $\Delta = 1 - 4a^n$.


Proof: The form $[a, 1, a^{n-1}]$ is concordant to itself if n > 1 and we can use this fact
to compute its powers inductively as in the proof of Theorem 7.7, with the result that
$[a, 1, a^{n-1}]^k = [a^k, 1, a^{n-k}]$. When k = n the latter form is $[a^n, 1, 1]$ which
represents 1 so it is the identity element in the class group. Thus the order of $[a, 1, a^{n-1}]$
is a divisor of n. The discriminant $1 - 4a^n$ is negative and the forms $[a^k, 1, a^{n-k}]$
are reduced if $k \le n - k$, or in other words if $k \le n/2$. None of these reduced forms
is the principal form if a > 1 so none is the identity in CG(Δ). Thus the order of
$[a, 1, a^{n-1}]$ is greater than $n/2$ so it must be n. □
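
Proposition 7.19 can be tested on small cases by multiplying forms directly. The sketch below does not use the concordant-form method of Theorem 7.7; as a stand-in it uses a standard composition formula for primitive forms of equal discriminant, followed by reduction to the unique reduced form in each proper equivalence class. All function names are our own, and the code is meant only for positive definite forms.

```python
def egcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b) >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return (a, x0, y0) if a >= 0 else (-a, -x0, -y0)

def reduce_form(a, b, c):
    """Reduced form properly equivalent to the positive definite form [a, b, c]."""
    disc = b * b - 4 * a * c
    while True:
        b %= 2 * a                  # shift b into the range -a < b <= a
        if b > a:
            b -= 2 * a
        c = (b * b - disc) // (4 * a)
        if a > c:
            a, b, c = c, -b, a      # exchange the outer coefficients and try again
        else:
            if a == c and b < 0:
                b = -b
            return (a, b, c)

def compose(f, g):
    """Product in the class group of two primitive forms with the same discriminant."""
    a1, b1, c1 = f
    a2, b2, c2 = g
    disc = b1 * b1 - 4 * a1 * c1
    s = (b1 + b2) // 2
    d1, x1, y1 = egcd(a1, a2)
    d, x2, y2 = egcd(d1, s)         # d = gcd(a1, a2, s) = (x2*x1)*a1 + (x2*y1)*a2 + y2*s
    v, w = x2 * y1, y2
    a3 = (a1 // d) * (a2 // d)
    b3 = (b2 + 2 * (a2 // d) * (v * ((b1 - b2) // 2) - w * c2)) % (2 * a3)
    c3 = (b3 * b3 - disc) // (4 * a3)
    return reduce_form(a3, b3, c3)

def order_in_class_group(f):
    """Smallest k >= 1 for which the k-th power of f is the principal form."""
    disc = f[1] ** 2 - 4 * f[0] * f[2]
    parity = disc % 2
    identity = reduce_form(1, parity, (parity - disc) // 4)
    f = reduce_form(*f)
    power, k = f, 1
    while power != identity:
        power = compose(power, f)
        k += 1
    return k

print(order_in_class_group((2, 1, 16)))   # a = 2, n = 5, discriminant -127
print(order_in_class_group((3, 1, 27)))   # a = 3, n = 4, discriminant -323
```

In these runs the successive powers pass through forms equivalent to $[a^k, 1, a^{n-k}]$, as the proof predicts.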


In general it is a hard question to determine which finite abelian groups occur as 
class groups. An interesting special case is to determine the values of n such that 
the product of n cyclic groups of order 2 is a class group CG(Δ) for some Δ. By
Proposition 7.16 this is equivalent to having $h_\Delta = h_\Delta^s$, and we have mentioned that
there is a list, probably complete, of 101 negative discriminants Δ with this property.
In these 101 cases the number of $C_2$ factors of CG(Δ) ranges from 0 to 4, so the
class number is 1, 2, 4, 8, or 16. Thus it appears that a product of five or more
copies of $C_2$ cannot occur as a class group CG(Δ) with Δ < 0. For Δ > 0 less seems
to be known.

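Here is a table listing the smallest discriminants having class group a given abelian
group of order up to 12:


Group                          Δ < 0      Δ > 0
$C_1$                             -3          5
$C_2$                            -15         12
$C_3$                            -23        148
$C_4$                            -39        136
$C_2 \times C_2$                 -84         60
$C_5$                            -47        401
$C_6$                            -87        316
$C_7$                            -71        577
$C_8$                            -95        505
$C_4 \times C_2$                -224        396
$C_2 \times C_2 \times C_2$     -420        480
$C_9$                           -199       1129
$C_3 \times C_3$               -4027      32009
$C_{10}$                        -119        817
$C_{11}$                        -167       1297
$C_{12}$                        -279       1345
$C_6 \times C_2$                -231        940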


As one can see, for positive discriminants one usually needs to go farther than for 
negative discriminants to realize a given group. 

While positive discriminants are more difficult both computationally and
theoretically, they have an extra piece of structure that adds to their interest, the operation
that sends a form Q to its negative -Q. This gives a well-defined operation on CG(Δ)
since if two forms $Q_1$ and $Q_2$ are properly equivalent then so are $-Q_1$ and $-Q_2$,
because an orientation-preserving linear fractional transformation taking the topograph
of $Q_1$ to the topograph of $Q_2$ takes the topograph of $-Q_1$ to the topograph of $-Q_2$.
Also, if Q is primitive then obviously so is -Q.

In CG(Δ) the operation sending Q to -Q is generally different from the operation
which sends Q to its mirror image form $Q^{-1}$ in CG(Δ). For example when Δ = 12
the group CG(Δ) is cyclic of order 2 consisting of the principal form $Q = x^2 - 3y^2$
and its negative $-Q = -x^2 + 3y^2$ which is equivalent to $3x^2 - y^2$. Thus Q and -Q
are distinct elements of CG(Δ), but $Q = Q^{-1}$ and $-Q = -Q^{-1}$ since Q and -Q
have mirror symmetry. Note that there is never any ambiguity about whether $-Q^{-1}$
is $-(Q^{-1})$, the negative of the mirror image of Q, or $(-Q)^{-1}$, the mirror image of the
negative of Q, since these are obviously the same.


Proposition 7.20. Inverses and negatives are related to symmetries and skew
symmetries in the following ways:
(a) $Q = Q^{-1}$ in CG(Δ) if and only if the topograph of Q has a mirror symmetry.
(b) $Q = -Q$ in CG(Δ) if and only if the topograph of Q has a 180 degree rotational
skew symmetry.
(c) $Q = -Q^{-1}$ in CG(Δ) if and only if the topograph of Q has a glide reflection
skew symmetry.


Proof: We have already seen that (a) holds. Statements (b) and (c) apply only to hy- 
perbolic forms, in which case we can focus on what is happening along the separator 
lines in their topographs. We take separator lines to be drawn in the usual way as 
horizontal lines with positive values above and negative values below. We can assume 
that the edges leading off the separator line occur at unit intervals. 

For (b), the separator line for the negative of a form Q is obtained by first changing
the sign of all the labels along the separator line for Q and then rotating the plane by 
180 degrees about some point on the separator line to bring the positive labels back 
above the separator line. If Q is properly equivalent to -Q this means that these 
two operations of changing signs and rotating produce the same separator line we 
started with, up to horizontal translation. Thus the composition of a rotation and a 
translation gives a skew symmetry of the separator line of Q. The two ends of the 
line are interchanged by this skew symmetry so it must fix some point on the line, 
as we saw in the discussion of symmetries of hyperbolic forms in Section 5.4. Hence
the skew symmetry must be a rotation about this point of the separator line. Thus if 
Q = -Q in CG(Δ), the topograph of Q has a 180 degree rotational skew symmetry.
The converse is obviously true as well. 

For (c), we can transform the separator line of a form Q to the separator line
of $-Q^{-1}$ by first changing the signs of the labels and rotating by 180 degrees to
get the separator line for -Q, then reflecting across a vertical line to convert this to
the separator line for $-Q^{-1}$. The composition of the rotation and the reflection is a
glide reflection along the separator line. Thus the separator line for Q is transformed
into the separator line for $-Q^{-1}$ by a glide reflection and changing the sign of the
labels. Hence if Q is properly equivalent to $-Q^{-1}$, the separator line for Q has
a skew symmetry obtained by combining a glide reflection with a translation. This
combination is again a glide reflection. □


[Figure: a tetrahedron drawn as a square with both diagonals, with vertices $Q$ and $Q^{-1}$ at the top and $-Q$ and $-Q^{-1}$ directly below them.]

We can picture the relationships between inverses and negatives by the diagram
at the right which can be viewed as a picture of a regular tetrahedron. The
tetrahedron has three 180 degree rotational symmetries about the three axes passing
through midpoints of opposite edges of the tetrahedron. One of these rotations sends
each form to its inverse, another sends each form to its negative, and the third sends
each form to the negative of its inverse. These rotational symmetries of the tetrahedron
are related to symmetries and skew symmetries of forms in the following ways:


- If Q has mirror symmetry then so does -Q so the top two forms are equal in
  CG(Δ) and so are the bottom two. The first of the three rotational symmetries
  of the tetrahedron realizes these equalities in CG(Δ).

- If Q has a rotational skew symmetry then so does $Q^{-1}$ so the two forms on the
  left are equal in CG(Δ) and so are the two on the right. These equalities are
  realized by the second rotation of the tetrahedron.

- If Q has a glide reflection skew symmetry then so does -Q so the two forms in
  each diagonal pair are equal in CG(Δ), and the third rotation of the tetrahedron
  gives these equalities.


When Q has two of the three types of symmetries and skew symmetries, it has the 
third type as well, so all four forms are equal in CG(Δ). In this case we will say
that Q is fully symmetric. For example the principal form always has mirror sym- 
metry and represents 1 so it is fully symmetric exactly when it represents —1 since 
Proposition 6.16 says this is equivalent to its having a skew symmetry. 

Now let us see how negation of forms relates to multiplication in CG(Δ). One
might guess that $(-Q_1)Q_2 = -(Q_1Q_2)$ as with numbers, but this turns out to be not
quite right as the following lemma shows:


Lemma 7.21. In CG(Δ) the formula $(-Q_1)Q_2 = -(Q_1Q_2^{-1})$ holds for all $Q_1$ and $Q_2$.
In particular, when $Q_1 = Q_2$ we have $(-Q_1)Q_1 = -Q_0$ where $Q_0$ is the principal
form.

Proof: The forms $Q_1$ and $Q_2$ are properly equivalent to a pair of concordant forms
$[a_1, b, a_2c]$ and $[a_2, b, a_1c]$. The form $[-a_1, -b, -a_2c]$ is then concordant to the
form $[a_2, -b, (-a_1)(-c)] = [a_2, -b, a_1c]$. Taking the product of this pair of
concordant forms gives $[-a_1, -b, -a_2c][a_2, -b, a_1c] = [-a_1a_2, -b, -c]$. This says that
$(-Q_1)(Q_2^{-1}) = -(Q_1Q_2)$. Replacing $Q_2$ by $Q_2^{-1}$ then gives the claimed formula
$(-Q_1)Q_2 = -(Q_1Q_2^{-1})$. □


Proposition 7.22. If one element of CG(Δ) has a glide reflection skew symmetry
then so do all elements of CG(Δ). This occurs exactly for those discriminants for
which the principal form represents -1.


Proof: Suppose that Q is a form with a glide reflection skew symmetry, so $Q = -Q^{-1}$
or equivalently $-Q = Q^{-1}$. Then if $Q_0$ is the principal form, we have $Q_0 = Q^{-1}Q =
(-Q)Q$ and this equals $-Q_0$ by the previous lemma. Thus $Q_0 = -Q_0$ if a single form
has a glide reflection skew symmetry. Once one has $Q_0 = -Q_0$, then for arbitrary Q
the formula $(-Q)Q = -Q_0$ says that Q is the inverse of -Q, so $Q = -Q^{-1}$ which
means that Q has a glide reflection skew symmetry. This proves the first statement
of the proposition. The second statement then follows since the principal form has a
glide reflection skew symmetry exactly when it represents -1. □
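
For small discriminants one can look for a representation of -1 by the principal form with a direct search. The following is only a heuristic sketch: the search bound is an arbitrary choice of ours, so a result of `False` shows that no representation exists within the bound rather than proving that none exists. The function names are hypothetical.

```python
def principal_form(disc):
    """[1, 0, -disc/4] for even discriminants, [1, 1, (1 - disc)/4] for odd ones."""
    parity = disc % 2
    return (1, parity, (parity - disc) // 4)

def represents_minus_one(disc, bound=200):
    """Search for integers x, y with |x|, |y| <= bound and Q(x, y) = -1."""
    a, b, c = principal_form(disc)
    # x >= 0 suffices since Q(-x, -y) = Q(x, y)
    return any(a * x * x + b * x * y + c * y * y == -1
               for x in range(bound + 1)
               for y in range(-bound, bound + 1))

for disc in (5, 8, 12, 13, 136):
    print(disc, principal_form(disc), represents_minus_one(disc))
```

For Δ = 12 the negative answer is genuine, since $x^2 - 3y^2 = -1$ has no solutions mod 3, which fits the earlier observation that Q and -Q are distinct in CG(12).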


Corollary 7.23. If the class number $h_\Delta$ is odd then all forms in CG(Δ) have a
glide reflection skew symmetry but only the principal form has a rotational skew
symmetry.


Proof: The principal form $Q_0$ has mirror symmetry and therefore so does $-Q_0$. Thus
$(-Q_0)^2 = Q_0$. If CG(Δ) has odd order then it has no elements of order 2 so we must
have $-Q_0 = Q_0$. Thus $Q_0$ has a rotational skew symmetry so it must also have a glide
reflection skew symmetry. By the preceding proposition all forms in CG(Δ) then have
a glide reflection skew symmetry. Any form which had a rotational skew symmetry
would therefore also have a mirror symmetry and hence be of order 1 or 2 in CG(Δ),
so it would have to be $Q_0$. □


One might ask whether the “one implies all” property in Proposition 7.22 also
holds for the other two types of symmetries and skew symmetries. For mirror
symmetries the only time all elements of CG(Δ) have mirror symmetry is when CG(Δ) is
a product of cyclic groups of order 2, a rather rare occurrence that we have discussed
before. For rotational skew symmetries it can happen that some forms have rotational
skew symmetry while others do not. We just saw that when CG(Δ) has odd order only
the principal form has rotational skew symmetry. An example where another form has
rotational skew symmetry but the principal form does not is Δ = 136. Here it is not
hard to compute that there are three equivalence classes of forms: $Q_0 = [1, 0, -34]$,
$-Q_0 = [-1, 0, 34]$, and $Q_1 = [3, 2, -11]$. Here are the topographs of $Q_0$ and $Q_1$:

[Figure: topographs of $Q_0$ and $Q_1$.]
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Since Qo and -Qo have mirror symmetry while Q, does not, the class number is 4. 
The group CG(A) must be C, rather than C, xC, since it contains a form Q, without 
mirror symmetry, so this form has order 4 rather than 2. Thus Q? has order 2 so 
it must be the form —Q,, as is confirmed by the fact that Q, represents 3 and -Qo 
represents 9. The topographs show that only Q, and OF have a rotational skew 
symmetry. 

When do all primitive forms of discriminant Δ have a rotational skew symmetry?
If this happens then in particular the principal form has a rotational skew symmetry,
as well as a mirror symmetry, so it also has a glide reflection skew symmetry. The
previous proposition then says that all primitive forms have a glide reflection skew
symmetry, in addition to the assumed rotational skew symmetry, so they have mirror
symmetry as well. Thus the class group is a product of cyclic groups of order 2
and the principal form represents −1. Conversely, these two conditions imply that
all primitive forms have mirror symmetry and glide reflection skew symmetry, hence
also rotational skew symmetry.


Another question one could ask is which discriminants have at least one primitive
form with rotational skew symmetry. This turns out to have a very pleasing answer.
As we observed near the end of Section 5.4, the pivot points of rotational skew symmetries
lie at the midpoints of edges of the separator line where the labels of the
adjacent regions in the topograph are a and −a. If the edge itself is labeled b then
the associated form is [a, b, −a], and all such forms occur this way at pivot points of
rotational skew symmetries. The discriminant of the form [a, b, −a] is b^2 + 4a^2 so
we are looking for solutions of x^2 + 4y^2 = Δ. For [a, b, −a] to be primitive means
that the pair (a, b) is primitive, so the question reduces just to finding the numbers
represented by the form x^2 + 4y^2, excluding squares since we want the resulting
forms [a, b, −a] to be hyperbolic. (Squares correspond to 0-hyperbolic forms with
rotational skew symmetry.) Here is a portion of the topograph of x^2 + 4y^2 showing
also the labels x/y = b/a that determine the associated forms [a, b, −a]:

[Figure: portion of the topograph of x^2 + 4y^2]

The form x^2 + 4y^2 has discriminant −16 with class number 1. From Theorems 6.11
and 7.7 we can deduce that the numbers represented by x^2 + 4y^2 are the numbers
2^m p_1 ⋯ p_k where m is 0, 2, or 3 and each p_i is a prime congruent to 1 mod 4.
This tells us which discriminants have at least one primitive form with rotational skew
symmetry.
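The description of the numbers represented by x^2 + 4y^2 can be checked by brute force over a small range. The following is only a sketch: the function names and the bound 2000 are choices made here for illustration, and "represented" is taken to mean represented with coprime x and y, as elsewhere in the book.

```python
from math import gcd, isqrt

def primitively_represented(n):
    """True if n = x^2 + 4y^2 for some coprime integers x, y."""
    for y in range(isqrt(n // 4) + 1):
        r = n - 4 * y * y
        x = isqrt(r)
        if x * x == r and gcd(x, y) == 1:
            return True
    return False

def predicted(n):
    """True if n = 2^m times a product of primes congruent to 1 mod 4, with m in {0, 2, 3}."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    if m not in (0, 2, 3):
        return False
    p = 3
    while p * p <= n:
        while n % p == 0:
            if p % 4 != 1:
                return False
            n //= p
        p += 2
    # Whatever is left is 1 or a single odd prime.
    return n == 1 or n % 4 == 1

# Numbers below the (arbitrary) bound where the two descriptions disagree.
mismatches = [n for n in range(1, 2000) if primitively_represented(n) != predicted(n)]
print(mismatches)
```

An empty list of mismatches is what the statement above predicts for this range.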

A more refined question is how many different elements of CG(Δ) have rotational
skew symmetries. Solutions of b^2 + 4a^2 = Δ come in groups of four obtained by
varying the signs of a and b. If we restrict attention just to the solutions with a
positive, the primitive solutions (a, b) correspond exactly to regions in the topograph
of x^2 + 4y^2 labeled Δ, and these regions come in pairs, one in the upper half of the
topograph with b > 0 and one in the lower half with b < 0. The sign of the label b
on an edge of a topograph with a pivot point can be specified by orienting all edges
of the separator line so that the regions on the left of the separator line have positive
labels. Taking the mirror image topograph then corresponds to changing the sign of b.
This might or might not give the same element of CG(Δ) depending on whether the
topograph has mirror symmetry.

The topograph of a form with rotational skew symmetry has two pivot points on
the separator line in each period. Thus the number of proper equivalence classes of
primitive forms of discriminant Δ with rotational skew symmetry is half the number
of regions labeled Δ in the topograph of x^2 + 4y^2, and is therefore equal to the
number of such regions in the upper half of the topograph. In other words the number
of elements of CG(Δ) with rotational skew symmetry equals the number of times that
Δ appears in the upper half of the topograph of x^2 + 4y^2. For example, a prime can
appear only once in the upper half of the topograph by Proposition 6.16 so prime
discriminants have only one element of CG(Δ) with rotational skew symmetry, and
this element must have mirror symmetry.

In general the number of rotationally skew symmetric forms in CG(Δ) can be
computed from the prime factorization of Δ using methods from the next chapter.
The result is that if Δ = 2^m p_1^{e_1} ⋯ p_k^{e_k} for distinct primes p_i ≡ 1 mod 4 with each
e_i > 0 then the number of forms in CG(Δ) with rotational skew symmetry is 2^{k−1}
when m = 0 or 2, and 2^k when m = 3.
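As a small sanity check of this count, one can enumerate the coprime pairs (a, b) with a > 0, b > 0 and b^2 + 4a^2 = D, which by the discussion above correspond to the regions labeled D in the upper half of the topograph of x^2 + 4y^2. The sketch below compares that count with 2^(k−1) for m = 0 or 2 and 2^k for m = 3; the function names and the handful of nonsquare discriminants tested are illustrative choices, not taken from the text.

```python
from math import gcd, isqrt

def count_pivot_pairs(D):
    """Number of coprime pairs (a, b) with a > 0, b > 0 and b^2 + 4a^2 = D."""
    count = 0
    for a in range(1, isqrt(D // 4) + 1):
        r = D - 4 * a * a
        b = isqrt(r)
        if b > 0 and b * b == r and gcd(a, b) == 1:
            count += 1
    return count

def predicted_count(m, k):
    """Count read off from D = 2^m * p_1^e_1 ... p_k^e_k with each p_i = 1 mod 4."""
    return 2 ** (k - 1) if m in (0, 2) else 2 ** k

# Each entry is (D, m, k); D is assumed nonsquare so the forms are hyperbolic.
examples = [(5, 0, 1), (13, 0, 1), (20, 2, 1), (40, 3, 1), (65, 0, 2), (136, 3, 1)]
for D, m, k in examples:
    print(D, count_pivot_pairs(D), predicted_count(m, k))
```

For D = 136 the two pairs found are (a, b) = (3, 10) and (5, 6), giving the forms [3, 10, −3] and [5, 6, −5].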


Exercises 


1. For discriminant A = —95 first compute the class number by finding all the reduced 
forms, then determine the structure of the class group in two different ways, first by 
applying Corollary 7.17 and then by seeing which forms represent powers of 2 up to 
or 

2. For discriminant A = —164 determine the structure of the class group and find the 
orders of all its elements. 


3. Do the same for discriminant A = —224. 


4. For discriminant A = 148 determine the class group and also the symmetries and 
skew-symmetries of the forms of that discriminant. 


5. Do the same for A = 145. 


6. (a) Show that the form [2, 1, m] has order at least n in its class group if 2m > 2^n.
(b) Show that the discriminant 1 — 8m in part (a) can be chosen to be a fundamental 
discriminant. 

(c) Do the analogues of (a) and (b) using the form [3,2, m] of even discriminant. 

7. Show that if a form Q of discriminant Δ represents a prime p coprime to Δ then
p^k is represented by Q if and only if the order of Q in the class group divides k − 1
or k + 1.


7.5 Genus and Rational Equivalence


The various genera of forms of discriminant Δ are determined by the characters
χ associated to primes p dividing Δ, where χ assigns a value χ(n) = ±1 to
each integer n not divisible by p. Since each character has a constant value on all
numbers in a topograph not divisible by p, we can regard characters as functions
from CG(Δ) to {±1}. A key property of characters is that they are multiplicative, so
χ(n_1 n_2) = χ(n_1)χ(n_2). This implies that characters are also multiplicative as functions
on CG(Δ), meaning that χ(Q_1 Q_2) = χ(Q_1)χ(Q_2) for forms Q_1 and Q_2 defining
elements of CG(Δ). This is because the topographs of Q_1 and Q_2 contain numbers
n_1 and n_2 not divisible by p and coprime to each other by Proposition 6.25, and then
the topograph of Q_1 Q_2 contains n_1 n_2. Thus χ(Q_1 Q_2) = χ(n_1 n_2) = χ(n_1)χ(n_2) =
χ(Q_1)χ(Q_2).

Since the values of characters are ±1 this implies that χ(Q^2) = +1 for each
primitive form Q. Therefore χ(Q_1 Q^2) = χ(Q_1)χ(Q^2) = χ(Q_1) for all Q_1 and Q. This
means that characters define functions on the group CG(Δ)/CG(Δ)^2 of congruence
classes of forms modulo squares. Let G(Δ) be the set of genera in discriminant Δ.
Since forms that are congruent modulo squares have the same genus, there is a well-defined
function Φ from CG(Δ)/CG(Δ)^2 to G(Δ) sending each congruence class of
forms to the genus of these forms.


Proposition 7.24. The function Φ from CG(Δ)/CG(Δ)^2 to G(Δ) is a one-to-one
correspondence. Thus two primitive forms Q_1 and Q_2 of discriminant Δ belong
to the same genus if and only if when we regard them as elements of CG(Δ) we
have Q_1 = Q_2 Q^2 for some primitive form Q of discriminant Δ.


We will give two proofs of this basic result. The first proof relies on Dirichlet’s 
Theorem on primes in arithmetic progressions which we have not proved in this book, 
and which we have previously used only at the end of Section 6.3 in the proofs of 
Theorem 6.26 and Corollaries 6.27 and 6.28. The second proof will use only results 
proved in this book, notably Legendre’s Theorem on solutions of ax^2 + by^2 = cz^2
from Section 2.3, but this proof has the disadvantage of applying only for fundamental 
discriminants. 


First proof: By the definition of genus, every genus contains at least one form, so Φ
is onto. Since a function between two finite sets with the same number of elements is
one-to-one if and only if it is onto, it will suffice to show that CG(Δ)/CG(Δ)^2 and G(Δ)
have the same number of elements. By Corollary 6.27 the number of genera is equal
to the number of elements of CG(Δ) corresponding to forms with mirror symmetry,
or in other words the elements of CG(Δ) of order 1 or 2. By Propositions 7.14 and
7.15 this equals the number of elements of CG(Δ)/CG(Δ)^2. □


For the second proof of Proposition 7.24 the main step will be the following: 


Proposition 7.25. If a primitive form of nonsquare discriminant belongs to the 
genus of the principal form then it represents a nonzero square. 


Proof: A primitive form ax^2 + bxy + cy^2 of discriminant Δ represents some positive
number coprime to 2Δ so after a change of variables we may assume a is this number.
Thus a is positive, odd, and coprime to Δ. If the form belongs to the genus of the
principal form, we wish to find an integer solution of ax^2 + bxy + cy^2 = z^2 with z ≠ 0.
This is equivalent to finding a rational solution with z ≠ 0 since a rational solution
yields an integer solution by multiplying x, y, and z by a common denominator.
Having an integer solution (x, y, z) means that the form ax^2 + bxy + cy^2 represents
a square since any common divisor of x and y will divide z and can then be canceled
from the equation.
After multiplying the equation ax^2 + bxy + cy^2 = z^2 by 4a it becomes:

    4a(ax^2 + bxy + cy^2) = (2ax + by)^2 + (4ac − b^2)y^2 = 4az^2

If we let w = 2ax + by this can be written as w^2 − Δy^2 = 4az^2 or Δy^2 + 4az^2 = w^2,
and a rational solution of this equation will give a rational solution of the original
equation ax^2 + bxy + cy^2 = z^2 with x = (w − by)/2a. If we write Δ and 4a as squares
times squarefree numbers Δ′ and a′ then the equation Δy^2 + 4az^2 = w^2 can be
replaced by Δ′y^2 + a′z^2 = w^2 by absorbing the square factors of Δ and 4a into y^2
and z^2. Since Δ and a were coprime, so are Δ′ and a′.

We would like to apply Legendre’s Theorem to the equation Δ′y^2 + a′z^2 = w^2.
The sign condition in the theorem is satisfied since a is positive, hence so is a′. The
remaining conditions are that Δ′ is a square mod a′ and a′ is a square mod Δ′. For
the first of these two conditions we know that Δ is a square mod a since Δ = b^2 − 4ac,
hence Δ is a square mod each prime dividing a. From the multiplicative property of
Legendre symbols it follows that Δ′ is also a square mod these primes and in particular
a square mod each prime dividing a′. These primes are odd since a is odd, so Δ′ is
a square mod a′ by Lemma 6.4 since a′ is a product of distinct primes.

Now consider the condition that a′ is a square mod Δ′. This is equivalent to a′
being a square mod each prime p dividing Δ′ since Δ′ is squarefree. For p = 2 this
holds automatically. For odd p this means the Legendre symbols (a′/p) = (a/p) have
value +1, which they do if the form ax^2 + bxy + cy^2 is in the genus of the principal
form since this form represents a.

Thus Legendre’s Theorem applies and there is a nontrivial integer solution of
Δ′y^2 + a′z^2 = w^2. This must have z nonzero, otherwise Δ′ would have to be 1 since
it is squarefree, and this would make Δ a square, contrary to hypothesis. □
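The proposition can be illustrated by a direct search, though a search is of course not how the proof works. The sketch below looks for coprime (x, y) where a form takes a square value, using the forms of discriminant −104 discussed in this chapter: [3, 2, 9] lies in the genus of the principal form, while [5, 4, 6] and [2, 0, 13] lie in the other genus. The function name and the search bound are choices made here.

```python
from math import gcd, isqrt

def find_square_value(a, b, c, bound=30):
    """Search |x|, |y| <= bound for coprime (x, y) with a x^2 + b x y + c y^2 = z^2, z > 0.

    Returns (x, y, z) for the first hit, or None if the search box contains none.
    """
    for x in range(0, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) != 1:
                continue
            n = a * x * x + b * x * y + c * y * y
            if n > 0 and isqrt(n) ** 2 == n:
                return (x, y, isqrt(n))
    return None

print(find_square_value(3, 2, 9))   # a form in the genus of the principal form
print(find_square_value(5, 4, 6))   # a form in the other genus
```

For −104, a fundamental discriminant, the forms outside the principal genus should yield no square at all, which is consistent with the search returning None for them inside this box.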


Corollary 7.26. For fundamental discriminants each form in the genus of the prin- 
cipal form is the square of another form. 

Proof: If a form Q is in the genus of the principal form the proposition says it represents
a nonzero square n^2. Let the prime factorization of n^2 be p_1^{2e_1} ⋯ p_k^{2e_k} for
distinct primes p_i. Theorem 7.7 then says that Q has a corresponding factorization
Q_1^{±2e_1} ⋯ Q_k^{±2e_k} in CG(Δ). Hence Q is the square of Q_1^{±e_1} ⋯ Q_k^{±e_k}. □


This corollary could also be deduced from Proposition 7.24 without using the 
hypothesis of fundamental discriminant. However the proof we gave for the corollary 
cannot yield this more general statement since it is not always true that a primitive 
form that represents a square must be the square of another form. For example for
Δ = −32 the form 3x^2 + 2xy + 3y^2 represents 4 when (x, y) = (1, −1) but this
form is not a square since the character χ_8 is defined for Δ = −32 and has the value
−1 on this form. The proof of the corollary does apply if the square represented is
coprime to the conductor. 


Second proof of Proposition 7.24, just for fundamental discriminants: Suppose Q_1
and Q_2 have the same genus. This means that all characters have the same values
for Q_1 and Q_2, so all characters have the value +1 on Q_1 Q_2^{-1}. This form therefore
lies in the genus of the principal form, so by the preceding corollary Q_1 Q_2^{-1} is a
square Q^2 in CG(Δ). Thus Q_1 = Q_2 Q^2 and so Q_1 and Q_2 give the same element of
CG(Δ)/CG(Δ)^2, which means that Φ is one-to-one. □


Let us illustrate the correspondence between elements of CG(Δ)/CG(Δ)^2 and
genera by the example of discriminant Δ = −104. We have already looked at this
example in some detail earlier in the chapter where we saw that CG(Δ) is a cyclic
group of order 6 generated by the form Q_4 = [5, 4, 6]. In terms of Legendre symbols
we have

    (Δ/p) = (−104/p) = (−1/p)(2/p)(13/p) = (−1/p)(2/p)(p/13)

The product (−1/p)(2/p) is +1 for p ≡ 1, 3 mod 8 and −1 for p ≡ 5, 7 mod 8 so this
is the character we called χ′_8 in Section 6.3, while (p/13) is χ_13, with the value +1
for p ≡ 1, 3, 4, 9, 10, 12 mod 13 and −1 for p ≡ 2, 5, 6, 7, 8, 11 mod 13. These are the
two characters for Δ = −104. Evaluating these characters on
numbers not divisible by 2 or 13 in the topographs shown in Section 7.1, we see that
Q_1 and Q_3^{±1} belong to one genus where the character values are +1, +1, while Q_2 and
Q_4^{±1} make up the other genus with character values −1, −1. Expressing the forms as
powers of the generator Q_4 we see that the even powers Q_4^2 = Q_3^{-1}, Q_4^4 = Q_3, and
Q_4^6 = Q_1 form one genus and the odd powers Q_4, Q_4^3 = Q_2, and Q_4^5 = Q_4^{-1} form the
other genus. Thus two forms belong to the same genus exactly when one is a square
times the other since the squares are the even powers of Q_4.
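The split of the six classes of discriminant −104 into two genera can be reproduced numerically. The sketch below lists the reduced forms, picks for each a represented value coprime to 2 and 13, and evaluates the two characters on it. The function names are choices made here; in particular chi8p is just a label for the character that is +1 on 1, 3 mod 8 and −1 on 5, 7 mod 8, and chi13 is computed by Euler's criterion.

```python
from math import gcd

def reduced_forms(D):
    """Reduced primitive positive definite forms (a, b, c) of discriminant D < 0."""
    forms = []
    a = 1
    while 3 * a * a <= -D:
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) == 0:
                c = (b * b - D) // (4 * a)
                if c >= a and not (a == c and b < 0) and gcd(gcd(a, b), c) == 1:
                    forms.append((a, b, c))
        a += 1
    return forms

def sample_value(form, D):
    """A value of the form at a coprime pair (x, y) that is coprime to D."""
    a, b, c = form
    for x in range(1, 6):
        for y in range(0, 6):
            n = a * x * x + b * x * y + c * y * y
            if gcd(x, y) == 1 and gcd(n, D) == 1:
                return n
    return None

def chi8p(n):
    return 1 if n % 8 in (1, 3) else -1

def chi13(n):
    return 1 if pow(n, 6, 13) == 1 else -1

D = -104
genera = {}
for form in reduced_forms(D):
    n = sample_value(form, D)
    genera.setdefault((chi8p(n), chi13(n)), []).append(form)

for symbol, forms in sorted(genera.items()):
    print(symbol, forms)
```

Each of the two genera found this way contains three reduced forms, in line with the even and odd powers of the generator.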


From Proposition 7.24 we can deduce the following interesting consequence of 
having a group structure in CG(A): 


Corollary 7.27. Each genus of forms of a given discriminant contains the same 
number of proper equivalence classes of forms. 


Proof: Let Q_1, …, Q_k be the distinct elements of CG(Δ) in the genus of the principal
form. By Proposition 7.24 these are exactly the elements of CG(Δ) that are squares.
The genus of an arbitrary element Q of CG(Δ) then consists of QQ_1, …, QQ_k since
these are all the elements of CG(Δ) obtained by multiplying Q by squares. These
multiples of Q are all distinct since if QQ_i = QQ_j then after multiplying by Q^{-1} we
have Q_i = Q_j so i = j. Thus each genus consists of k elements of CG(Δ). □


For a fixed discriminant A the class number is the product of the number of 
genera times the number of classes in each genus. There are two extreme situations 
that can occur when one or the other of these two factors is 1: 


(1) The number of genera is 1, so the primitive forms of discriminant Δ all have the
same genus. Equivalent ways of stating this condition are:

    • The only primitive forms with mirror symmetry are the forms equivalent to
      the principal form.

    • CG(Δ) contains no elements of order 2.

    • CG(Δ) contains no elements of even order.

    • The class number is odd.

(2) Each genus consists of a single equivalence class of forms. Again there are equivalent
statements:

    • The number of genera equals the class number.

    • Every form has mirror symmetry.

    • Every element of CG(Δ) has order 1 or 2.

    • CG(Δ) is a product of cyclic groups of order 2.

    • The representation problem of determining which numbers are represented
      by each primitive form has a solution just in terms of congruence classes
      modulo the discriminant.


Discriminants where (1) or (2) occurs are rather rare. For (1), Corollary 5.10 says 
exactly when this happens in terms of the prime factorization of A. For (2) there is 
no such simple characterization. 


The relationships between the class group, genus, and symmetry can be expressed 
concisely in a sequence of groups and functions between them: 


    SCG(Δ) --> CG(Δ) --Sq--> CG(Δ) --Ch--> TS(Δ) --X_Δ--> {±1}


Here SCG(Δ) is the symmetric class group, the subgroup of CG(Δ) consisting of
symmetric forms, and the function SCG(Δ) --> CG(Δ) is just the inclusion of this
subgroup into CG(Δ). The function Sq is squaring, sending a form Q to Q^2. The
group TS(Δ) is the set of “total symbols” (±1, …, ±1) with one coordinate for each
character defined for discriminant Δ. The group structure in TS(Δ) is multiplication
in each coordinate separately. The function Ch is the “total character” sending each
form to the values of the various characters on this form. The last function X_Δ is the
product of characters that measures whether a prime not dividing Δ is represented
in discriminant Δ, so for fundamental discriminants this is all the characters.

The compositions of successive functions in the five-term sequence above have 
a special property: For each pair of adjacent functions A --f--> B --g--> C an element b
in the middle group B is sent by g to the identity element of C exactly when b is 
equal to the image f(a) of some element a in A. A sequence of functions with this 
property is called an exact sequence. Let us see what this means for each of the three 
middle groups in the five-term sequence above. 


(1) Exactness at the first CG(Δ) term is the fact that the square Q^2 of a form Q is
the identity in CG(Δ) exactly when Q is symmetric.

(2) Exactness at the second CG(Δ) term means that a form Q belongs to the genus
of the principal form exactly when Q is the square of a form in CG(Δ).

(3) Exactness at TS(Δ) means that X_Δ has the value +1 on a total symbol exactly
when this is the total symbol given by the character values of some form.


Exactness in (1) is fairly easy. In (2) and (3) the easier half of exactness is the statement
that an element in the image of one function is sent by the next function to the identity
element of the next group. These are the statements obtained by omitting the word
“exactly”. The more difficult half of (2) is Corollary 7.26 which, as we noted above,
holds not just for fundamental discriminants. The more difficult half of (3) is what
we showed to prove Theorem 6.26.


Now let us consider the relationship between genus and the simultaneous repre- 
sentation of numbers by forms of the same discriminant that are not equivalent. 


Proposition 7.28. If two primitive forms of the same discriminant represent the 
same number coprime to the conductor then the two forms are in the same genus. 


For numbers coprime to the discriminant this is a simple consequence of the 
definition of genus, but the result is less obvious in the more general situation, and 
indeed often fails to hold for numbers not coprime to the conductor. An example is 
discriminant —32 with conductor 2 where the two forms [1,0,8] and [3,2,3] both 
represent 8 but have different genus since the character χ_4 is defined when Δ = −32
and has the value +1 on [1, 0, 8] and −1 on [3, 2, 3].


Proof: According to Theorem 7.7 we obtain the various primitive forms representing
a number n coprime to the conductor as products Q_1^{±e_1} ⋯ Q_k^{±e_k} where the prime
factorization of n is n = p_1^{e_1} ⋯ p_k^{e_k} and Q_i represents p_i. Changing the exponent
of Q_i from +e_i to −e_i amounts to multiplying Q_i^{e_i} by a square Q_i^{−2e_i} and similarly
for changing the exponent from −e_i to +e_i. As we noted earlier, multiplying a form
by the square of another form does not change its genus. So any two primitive forms
representing n have the same genus. □


Proposition 7.29. If two primitive forms are of the same genus then there exist 
numbers that are represented by both forms, and in fact there are infinitely many 
such numbers. 


Proof: If the primitive forms Q_1 and Q_2 of discriminant Δ have the same genus then
there is a form Q such that Q_1 = Q_2 Q^2 in CG(Δ). Choose a number k represented
by Q_2. We can then choose a number m represented by Q and coprime to k, and
after this a number n represented by Q and coprime to km, so all three of k, m, and
n are coprime. Then kmn is represented by both Q_2 Q^2 = Q_1 and Q_2 Q Q^{-1} = Q_2.
There are infinitely many choices possible for n since new choices can always be made
coprime to all previously chosen numbers. □


Legendre’s Theorem shows that determining when quadratic curves contain ratio- 
nal points is much easier than determining when they contain integer points. Pursuing 
this idea, our goal in the rest of this section will be to see how the general theory of 
quadratic forms becomes much simpler when rational numbers are used in place of 
integers, and in fact reduces largely to genus theory. 

As an illustration consider the two forms Q_1(x, y) = x^2 + 14y^2 and Q_2(x, y) =
2x^2 + 7y^2 of discriminant −56 that we considered in Section 6.1. These forms
have the same genus since the two characters for this discriminant are χ_7 and χ_8
which both take the value +1 on the two forms. We could also deduce this from
Proposition 7.28 since both forms represent 15. However, the two forms are not
equivalent. This means that there is no matrix (p q; r s) with integer entries and determinant
±1 such that Q_1(px + qy, rx + sy) = Q_2(x, y). But if we broaden
our perspective to allow rational numbers as entries then there is such a matrix,
namely the matrix (1/3)(2 −7; 1 1) of determinant +1, since a simple calculation shows
that Q_1((2x − 7y)/3, (x + y)/3) = Q_2(x, y). There are other matrices that could be
used instead of (1/3)(2 −7; 1 1), for example (1/5)(6 −7; 1 3).
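The "simple calculation" can be carried out with exact rational arithmetic. In the sketch below the two matrices, with entries 2/3, −7/3, 1/3, 1/3 and 6/5, −7/5, 1/5, 3/5, are one valid choice assumed here (sign variations work equally well); the helper change_variables is likewise a name introduced for this illustration.

```python
from fractions import Fraction as Fr

def change_variables(form, matrix):
    """Coefficients of Q(px + qy, rx + sy) for Q = [a, b, c] and matrix ((p, q), (r, s))."""
    a, b, c = form
    (p, q), (r, s) = matrix
    return (a * p * p + b * p * r + c * r * r,
            2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s,
            a * q * q + b * q * s + c * s * s)

Q1 = (1, 0, 14)  # x^2 + 14y^2
thirds = ((Fr(2, 3), Fr(-7, 3)), (Fr(1, 3), Fr(1, 3)))
fifths = ((Fr(6, 5), Fr(-7, 5)), (Fr(1, 5), Fr(3, 5)))

for matrix in (thirds, fifths):
    (p, q), (r, s) = matrix
    determinant = p * s - q * r
    print(determinant, change_variables(Q1, matrix))
```

Both matrices have determinant 1 and carry x^2 + 14y^2 to the coefficients of 2x^2 + 7y^2.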

This example leads us to define two forms Q_1 and Q_2 to be rationally equivalent
if there exists a matrix (p q; r s) with rational entries and nonzero determinant such that
Q_1(px + qy, rx + sy) = Q_2(x, y). The determinant condition ensures that the matrix
has an inverse, also with rational entries, so the change of variables is reversible. In
the example the determinant was +1, and in this case the forms are said to be properly
rationally equivalent, or more briefly, properly Q-equivalent.

Having allowed rational coefficients when we change variables, we can go a step 
further and consider rational forms ax^2 + bxy + cy^2 where the coefficients and
variables are all allowed to be rational numbers. Rational equivalence of rational 
forms is defined just as it was for integral forms in the previous paragraph. 

To see the effect of a rational change of variables on the discriminant of a rational
form we can use the matrix notation (a b̄; b̄ c) for a form [a, b, c] from Section 7.1, where
b̄ = b/2. The discriminant b^2 − 4ac is −4 times the determinant of this matrix. When
we change variables via a rational matrix (p q; r s) the new form corresponds to the
matrix (p r; q s)(a b̄; b̄ c)(p q; r s) so the discriminant b^2 − 4ac is multiplied by the square of
the determinant of the change-of-variables matrix since (p q; r s) and (p r; q s) have the
same determinant. In particular this means that properly Q-equivalent forms have
the same discriminant.
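The scaling of the discriminant by the square of the determinant can be checked on any example using the matrix description of a form. The sketch below uses plain 2×2 lists of fractions; the particular form [3, 2, 3] and the change-of-variables matrix are arbitrary choices made for illustration.

```python
from fractions import Fraction as Fr

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def form_matrix(a, b, c):
    """Symmetric matrix of the form [a, b, c], with off-diagonal entries b/2."""
    return [[Fr(a), Fr(b) / 2], [Fr(b) / 2, Fr(c)]]

M = form_matrix(3, 2, 3)
P = [[Fr(1, 2), Fr(3)], [Fr(5, 7), Fr(-2)]]   # arbitrary rational matrix, nonzero determinant
N = matmul(transpose(P), matmul(M, P))          # matrix of the form after the change of variables

disc_old = -4 * det(M)
disc_new = -4 * det(N)
print(disc_old, disc_new, det(P) ** 2 * disc_old)
```

The last two printed values agree, reflecting that the determinant is multiplicative and unchanged by transposing.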

Using rational numbers gives added flexibility to prove certain statements that 
do not hold when only integers are allowed. Here are some instances of this: 


Proposition 7.30. (a) If a rational form takes on the nonzero value a then it is 
properly Q-equivalent to a form [a,0,c]. In particular every rational form is 
properly Q-equivalent to a form [a,0,c]. 

(b) If two rational forms of the same discriminant take on the same nonzero value 
then they are properly Q-equivalent. 


Since the discriminant Δ of a form [a, 0, c] is −4ac it follows that c is determined
by a and the discriminant, namely c = −Δ/4a. For example the two forms [1, 0, 14]
and [2, 0, 7] of discriminant −56 both take the value 15 so by part (a) of the proposition
they are both properly Q-equivalent to [15, 0, 14/15] and hence to each other.
As another example the principal form x^2 + xy + ky^2 of discriminant 1 − 4k takes
the value 1 so it is properly Q-equivalent to x^2 + ((4k − 1)/4)y^2 and this form is rationally
equivalent to x^2 + (4k − 1)y^2, the principal form of discriminant 4(1 − 4k).


Proof: Let Q be a rational form taking the nonzero value a when (x, y) = (p, q)
for rational numbers p and q, not both zero. The numbers p and q form the first
column of a matrix (p r; q s) of determinant 1 since the equation ps − qr = 1 always
has a solution with rational numbers r and s. For example, if p ≠ 0 we can choose
r = 0 and s = 1/p and if q ≠ 0 we can choose s = 0 and r = −1/q. We use the
matrix (p r; q s) to change variables to get a new form Q(px + ry, qx + sy) properly
Q-equivalent to Q whose value at (x, y) = (1, 0) is Q(p, q) = a. Thus Q is properly
Q-equivalent to a form [a, b, c] for some rational numbers b and c. This form can
be rewritten as:

    ax^2 + bxy + cy^2 = a(x + (b/2a)y)^2 + (c − b^2/4a)y^2

Thus if we change variables to X = x + (b/2a)y and Y = y the form [a, b, c] becomes
[a, 0, c′] for c′ = c − b^2/4a. The matrix for this change of variables has determinant
1 so the form [a, 0, c′] is properly Q-equivalent to [a, b, c] and hence also to the
original form Q. This proves statement (a).

Statement (b) follows from (a) since the coefficient c in a form [a, 0, c] is determined
by a and the discriminant when a ≠ 0. □
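The two steps of the proof translate directly into a short computation: complete (p, q) to a matrix of determinant 1, then complete the square. The following sketch does this with exact fractions; the function name diagonalize is introduced here, and the examples are the forms [1, 0, 14] and [2, 0, 7] at points where each takes the value 15.

```python
from fractions import Fraction as Fr

def diagonalize(form, point):
    """Return (v, 0, c') properly Q-equivalent to the form, where v is its value at point."""
    a, b, c = (Fr(t) for t in form)
    p, q = (Fr(t) for t in point)
    value = a * p * p + b * p * q + c * q * q
    if value == 0:
        raise ValueError("the form must take a nonzero value at the given point")
    # Complete (p, q) to a matrix (p r; q s) with ps - qr = 1, as in the proof.
    if p != 0:
        r, s = Fr(0), 1 / p
    else:
        r, s = -1 / q, Fr(0)
    # Coefficients of Q(px + ry, qx + sy); the x^2 coefficient is Q(p, q) = value.
    b1 = 2 * a * p * r + b * (p * s + q * r) + 2 * c * q * s
    c1 = a * r * r + b * r * s + c * s * s
    # The substitution X = x + (b1 / (2 value)) y, Y = y removes the cross term.
    return (value, Fr(0), c1 - b1 * b1 / (4 * value))

print(diagonalize((1, 0, 14), (1, 1)))
print(diagonalize((2, 0, 7), (2, 1)))
```

Both calls produce the same diagonal form with first coefficient 15, as statement (b) requires for two forms of discriminant −56 taking the value 15.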


For the next proposition we return to forms with integer coefficients. 


Proposition 7.31. Primitive forms of the same genus are properly Q-equivalent. 
For fundamental discriminants the converse is also true: Properly Q-equivalent 
forms have the same genus. 


An example showing the necessity of the extra hypothesis in the converse is pro- 
vided by the forms [1,0,8] and [3,2,3] of discriminant —32 which have different 
genus but are properly Q-equivalent since they both represent 8. 


Proof: For the first statement, two primitive forms of the same genus represent the 
same number by Proposition 7.29, and then the previous proposition says they are 
properly Q-equivalent. 

Conversely, suppose Q and Q′ are primitive forms of discriminant Δ that are
properly Q-equivalent. Let k be a number represented by Q. If k is divisible by p^2
for some prime p, say k = p^2 m, then if Δ is a fundamental discriminant Theorem 7.7
implies that Q is equivalent to the product of a form representing m and the square
of a form representing p. The form representing m is then in the same genus as Q
and thus also properly Q-equivalent to Q′, so for proving the converse we can replace
Q by this form. After repetitions of this step we can then assume that the number k
represented by Q is squarefree.
Since Q and Q′ are properly Q-equivalent they take on the same rational values
as the variables range over all rational numbers. Thus there exist integers x, y, z
such that Q′(x/z, y/z) = k and hence Q′(x, y) = kz^2. We would like to say that
Q′ represents kz^2, and this will be the case if x and y are coprime. Suppose on
the contrary that x and y are both divisible by some prime p. We can assume p
does not divide z, otherwise the fractions x/z and y/z could be reduced. Since p
divides x and y it follows that p^2 divides Q′(x, y) = kz^2 and hence p^2 divides k.
This contradicts the fact that k is squarefree, so we deduce that Q′ represents kz^2.
Using Theorem 7.7 again and the assumption that Δ is a fundamental discriminant
we conclude that Q′ is the product of a form Q″ representing k and the square of
some form representing z, so Q′ and Q″ are in the same genus. Since Q and Q″
both represent k they have the same genus by Proposition 7.28. Hence Q and Q′
have the same genus. □


In the remainder of this section we will describe the classification of rational forms
up to rational equivalence. The first difference from the classification of integer forms
up to integer equivalence as in Chapter 5 involves the discriminant. As we have seen,
a change of variables by a matrix of nonzero rational determinant r multiplies the
discriminant by r^2. For example the change of variables replacing (x, y) by (rx, y)
has this effect. This leads us to consider nonzero rational numbers modulo squares,
so two nonzero rational numbers are regarded as equivalent modulo squares if one is
obtained from the other by multiplying by the square of a nonzero rational number.
Every nonzero rational number is equivalent modulo squares to an integer since we
can multiply by the square of its denominator. Thus p/q becomes pq, turning division
into multiplication. After this, any square integer factor of the resulting integer can
be eliminated by multiplying by the reciprocal of this square factor. In this way every
equivalence class of nonzero rational numbers modulo rational squares contains a
squarefree nonzero integer, and this integer is obviously unique.
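The reduction just described is easy to mechanize: multiply p/q by q^2 to get the integer pq, then strip square factors. A minimal sketch follows, with the function name chosen here and plain trial division used for the factoring.

```python
from fractions import Fraction

def squarefree_representative(r):
    """Squarefree integer equivalent to the nonzero rational r modulo rational squares."""
    r = Fraction(r)
    if r == 0:
        raise ValueError("r must be nonzero")
    n = r.numerator * r.denominator   # p/q times q^2 equals pq
    sign = -1 if n < 0 else 1
    n = abs(n)
    result = 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            result *= p               # keep one copy of each prime of odd exponent
        p += 1
    return sign * result * n          # the leftover n is 1 or a prime

for r in (Fraction(3, 4), Fraction(2, 3), Fraction(18), Fraction(-1, 4), Fraction(50, 27)):
    print(r, squarefree_representative(r))
```

Applied to a discriminant, this gives what the text calls the reduced discriminant.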

In particular every nonzero discriminant is equivalent modulo squares to a unique 
nonzero squarefree integer discriminant which we call a reduced discriminant. When 
we speak of the reduced discriminant of a form we will mean the unique squarefree in- 
teger that is equivalent to its discriminant modulo squares. For example for a nonzero 
squarefree integer d the forms x^2 − (d/4)y^2 and 4x^2 − dy^2 both have reduced discriminant
d. Thus all squarefree nonzero integers occur as reduced discriminants.
A reduced discriminant is a fundamental discriminant if it is congruent to 1 mod 4, 
and otherwise four times the reduced discriminant is a fundamental discriminant. 

A form Q and a nonzero rational multiple rQ have the same reduced discriminant.
However Q and rQ may not be rationally equivalent. An example is provided by
the forms x^2 + y^2 and 3x^2 + 3y^2 with reduced discriminant −1, as we will soon see.
On the other hand Q and r^2 Q are rationally equivalent since r^2 Q(x, y) = Q(rx, ry).
It follows that every rational form is rationally equivalent to an integer form, so it will
suffice to classify integer forms up to rational equivalence.


Proposition 7.32. If two rational forms of the same reduced discriminant take on 
the same nonzero value then they are rationally equivalent. 


Proof: Let the two forms be Q and Q′. Since they have the same reduced discriminant
there is a rational number r such that the discriminant of Q′ is r^2 times the
discriminant of Q. The form Q″(x, y) = Q(rx, y) has the same discriminant as
Q′ and is rationally equivalent to Q, hence has the same values as Q. Thus we may
assume from the start that Q and Q′ have the same discriminant. Proposition 7.30
then gives the result. □


For a fixed reduced discriminant δ all rational numbers r occur as values of
rational forms of reduced discriminant δ since if Q_0 is the principal form for the
associated fundamental discriminant then rQ_0 has the same reduced discriminant
as Q_0 and takes the value r. Proposition 7.32 then says that the sets of nonzero
values of forms of reduced discriminant δ give a partition of the set of all nonzero
rational numbers into disjoint subsets, and these subsets correspond exactly to the
rational equivalence classes of forms of reduced discriminant δ.

As a very special case, for reduced discriminant 1 there is the form $xy$ and this
takes on all rational values, so all rational forms of reduced discriminant 1 are
rationally equivalent. This includes all 0-hyperbolic integer forms since the discriminants
of these forms are nonzero squares.

To deal with the general case the following result will be useful: 


Proposition 7.33. The values taken on by a rational form $Q(x,y)$ as $x$ and $y$
range over all rational numbers are exactly the values $r^2Q(x,y)$ as $x$ and $y$
range over all integers and $r$ ranges over all rational numbers.


Proof: For each integer pair $(x,y)$ and each rational number $r$ we have $r^2Q(x,y) =
Q(rx,ry)$ so rational squares times values at integer pairs are values at rational pairs.
Conversely, if $(x,y)$ is a rational pair there is a nonzero integer $d$ such that $(dx,dy)$
is an integer pair, and then $Q(x,y) = d^{-2}Q(dx,dy)$ so every value at a rational pair
is a rational square times a value at an integer pair. □


Multiplying a form by a nonzero square does not affect the signs of its values, so 
the basic distinction between elliptic, hyperbolic, 0-hyperbolic, and parabolic forms 
still holds for rational forms. We have seen that all 0-hyperbolic forms are rationally 
equivalent. The classification of parabolic forms up to rational equivalence is easy 
and will be left as an exercise. This leaves hyperbolic and elliptic forms. For elliptic 
forms we can restrict attention to those taking positive values as we did for integer 
forms. 


As a first example let us work out the classification of forms of reduced discriminant
$-1$ up to rational equivalence. The associated fundamental discriminant is $-4$,
with class number 1 so all integer forms of discriminant $-4$ are equivalent to $x^2 + y^2$.
The values of this form for integers $x$ and $y$ are all the positive numbers whose prime
factorization contains primes $p \equiv 3$ mod 4 only to even powers. The values for rational
$x$ and $y$ are then all such products where negative as well as positive exponents
on primes are allowed.

Consider next the form $3x^2 + 3y^2$ which also has reduced discriminant $-1$. The
values this form takes on for rational $x$ and $y$ can be described in the same way as for
$x^2 + y^2$ except that now the exponent on the prime 3 must be odd rather than even.
Thus this form is not rationally equivalent to $x^2 + y^2$. More generally, for any finite set
of primes $p_1, \dots, p_k$ congruent to 3 mod 4 the values of the form $p_1 \cdots p_k(x^2 + y^2)$
are the products in which each $p_i$ has odd exponent. Different sets of primes $p_i \equiv 3$
mod 4, including the empty set for the form $x^2 + y^2$, give forms taking on disjoint
sets of values, so all these sets of primes give different rational equivalence classes
of forms. Every rational equivalence class is realized in this way since one can take
any form in this class and any nonzero value $r$ this form takes on, then take the set
of primes $p_i \equiv 3$ mod 4 that occur to an odd power in the prime factorization of $r$.
Thus we have determined all of the infinitely many rational equivalence classes of
forms of reduced discriminant $-1$.
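The classification just described can be tried out numerically. The following is a minimal sketch, restricted to positive values; the helper names `factorize` and `odd_primes_3_mod_4` are our own and do not come from the text. Given any positive value $r$ taken on by a form of reduced discriminant $-1$, it returns the set of primes $p \equiv 3$ mod 4 occurring to an odd power in $r$, which by the discussion above labels the rational equivalence class of the form.

```python
from fractions import Fraction


def factorize(n):
    """Prime factorization of a positive integer as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def odd_primes_3_mod_4(r):
    """Primes p = 3 mod 4 that occur to an odd power in the positive rational r.

    Exponents coming from the denominator are counted as negative.
    """
    r = Fraction(r)
    exponents = factorize(r.numerator)
    for p, e in factorize(r.denominator).items():
        exponents[p] = exponents.get(p, 0) - e
    return sorted(p for p, e in exponents.items() if p % 4 == 3 and e % 2 != 0)


# x^2 + y^2 at (1/2, 1) takes the value 5/4: empty set of primes.
print(odd_primes_3_mod_4(Fraction(5, 4)))
# 3x^2 + 3y^2 at (1/3, 1/3) takes the value 2/3: the prime 3 has odd exponent.
print(odd_primes_3_mod_4(Fraction(2, 3)))
```

Two values with different labels cannot be values of rationally equivalent forms, which is how $x^2 + y^2$ and $3x^2 + 3y^2$ are told apart.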

Other fundamental discriminants of class number 1 work in the same way. For
example for discriminant $-3$ we have the form $x^2 + xy + y^2$ whose values are products
of primes in which primes $p \equiv 2$ mod 3 occur only to even powers. The rational
equivalence classes then correspond to multiples of $x^2 + xy + y^2$ by finite products
of distinct primes $p \equiv 2$ mod 3. Instead of $x^2 + xy + y^2$ we could use $x^2 + 3y^2$
which has the same reduced discriminant and is rationally equivalent to $x^2 + xy + y^2$
since both forms take the value 3.

In the general case the rational classification of forms of a given reduced
discriminant $\delta$ involves the different genera of forms of the associated fundamental
discriminant $\Delta$. By Proposition 7.31 each of these genera corresponds to exactly one
rational equivalence class of forms. Choose one form $Q_i$ in each of these genera. The
values of integer forms of discriminant $\Delta$ are the numbers whose prime factorization
contains certain primes only to even powers, namely the primes not represented in
discriminant $\Delta$, which are the primes in certain congruence classes mod $\Delta$. The
rational equivalence classes for reduced discriminant $\delta$ then correspond exactly to the
forms $p_1 \cdots p_k Q_i$ where $p_1, \dots, p_k$ are distinct primes not represented in
discriminant $\Delta$.


Exercises 


1. Find all the instances in the large table in Section 6.2 where two primitive forms of 
the same discriminant but different genus represent the same power of the conductor.


2. For discriminant $\Delta = -260$ the equivalence classes of forms were worked out
in Section 5.2. Show that $CG(\Delta)$ is $C_2 \times C_4$, partition the forms into genera, and
determine the order of each element of $CG(\Delta)$. Which elements are squares of other
elements?


3. For discriminant $\Delta = -119 = -7 \cdot 17$ show that $CG(\Delta)$ is cyclic, determine its
order, and find forms giving all the elements. Then partition these elements according
to their genus and determine the order of each element. (All this can be done without
actually computing any products using concordant pairs of forms.)


4. (a) Determine the values of $n$ for which the curve $2x^2 + ny^2 = 1$ contains rational
points, assuming $n$ is odd and squarefree. For each of the first three positive values
of $n$ for which the curve contains rational points find two of these rational points
that lie in the first quadrant.

(b) Show that the case that $n$ is even and squarefree reduces to the previous case.


5. Determine the values of $n$ for which the curve $3x^2 + ny^2 = 1$ contains rational
points, assuming $n$ is odd, squarefree, and coprime to 3.


6. What is the classification of rational forms of discriminant 0 up to rational
equivalence?


7. Show that for each nonzero reduced discriminant $\delta$ there is a unique form $x^2 + by^2$
of reduced discriminant $\delta$ with $b$ a squarefree integer, and show that every form of
reduced discriminant $\delta$ is rationally equivalent to a form $a(x^2 + by^2)$.


8 Quadratic Fields


Even when one's primary interest is in integers it can sometimes be very helpful
to consider more general sorts of numbers. For example, when studying the principal
quadratic form $x^2 - Dy^2$ of discriminant $4D$ it can be a great aid to understanding
to allow ourselves to factor this form as $(x + y\sqrt{D})(x - y\sqrt{D})$. Here we allow $D$ to
be negative as well as positive, in which case we would be moving into the realm of
complex numbers.

To illustrate this idea, consider the case $D = -1$, so the form is $x^2 + y^2$ which
we are factoring as $(x + yi)(x - yi)$. Writing a number $n$ as a sum $a^2 + b^2$ is then
equivalent to factoring it as $(a + bi)(a - bi)$. For example $5 = 2^2 + 1^2 = (2 + i)(2 - i)$,
and $13 = 3^2 + 2^2 = (3 + 2i)(3 - 2i)$, so 5 and 13 are no longer prime when we allow
factorizations using numbers $a + bi$. Sometimes a nonprime number such as 65 can
be written as the sum of two squares in more than one way: $65 = 8^2 + 1^2 = 4^2 + 7^2$,
so it has factorizations as $(8 + i)(8 - i)$ and $(4 + 7i)(4 - 7i)$. This becomes more
understandable if we factorize 65 as:


$$65 = 5 \cdot 13 = (2 + i)(2 - i)(3 + 2i)(3 - 2i)$$


If we combine these four terms as $(2 - i)(3 + 2i) = 8 + i$ and $(2 + i)(3 - 2i) = 8 - i$
we get the representation $65 = 8^2 + 1^2 = (8 + i)(8 - i)$, whereas if we combine them
as $(2 + i)(3 + 2i) = 4 + 7i$ and $(2 - i)(3 - 2i) = 4 - 7i$ we get the other representation
$65 = 4^2 + 7^2 = (4 + 7i)(4 - 7i)$.
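The two groupings are easy to check by machine. Here is a small sketch in which a Gaussian integer $x + yi$ is stored as a pair `(x, y)`; the function names `gmul` and `gnorm` are illustrative choices of ours.

```python
def gmul(a, b):
    """Product of the Gaussian integers a = (x, y) and b = (z, w), using i^2 = -1."""
    (x, y), (z, w) = a, b
    return (x * z - y * w, x * w + y * z)


def gnorm(a):
    """The value x^2 + y^2, that is, (x + yi)(x - yi)."""
    x, y = a
    return x * x + y * y


# The four factors of 65 = 5 * 13.
p, p_bar = (2, 1), (2, -1)
q, q_bar = (3, 2), (3, -2)

first = gmul(p_bar, q)   # (2 - i)(3 + 2i)
second = gmul(p, q)      # (2 + i)(3 + 2i)
print(first, gnorm(first))
print(second, gnorm(second))
```

Both products have $x^2 + y^2 = 65$, one giving the representation $8^2 + 1^2$ and the other $4^2 + 7^2$.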

More generally we will consider the set $\mathbb{Z}[\sqrt{D}]$ of all numbers $x + y\sqrt{D}$ with $x$
and $y$ integers. Thus $\mathbb{Z}[\sqrt{D}]$ consists of real numbers if $D > 0$ and complex numbers
if $D < 0$. We will always assume the integer $D$ is not a square, so $\mathbb{Z}[\sqrt{D}]$ is not just $\mathbb{Z}$.
When $D = -1$ we have $\mathbb{Z}[\sqrt{D}] = \mathbb{Z}[i]$, and numbers $x + yi$ in $\mathbb{Z}[i]$ are known as
Gaussian integers.

We will also have occasion to consider numbers $x + y\sqrt{D}$ where $x$ and $y$ are
allowed to be rational numbers, not just integers. The set of all such numbers is
denoted $\mathbb{Q}(\sqrt{D})$ with round parentheses instead of square brackets to emphasize that
$\mathbb{Q}(\sqrt{D})$ is a field while $\mathbb{Z}[\sqrt{D}]$ is only a ring. In other words, in $\mathbb{Q}(\sqrt{D})$ we can perform
all four of the basic arithmetic operations of addition, subtraction, multiplication, and
division, whereas in $\mathbb{Z}[\sqrt{D}]$ only the first three operations are possible in general.
Division by a nonzero element $x + y\sqrt{D}$ of $\mathbb{Q}(\sqrt{D})$ is possible since it amounts to
multiplication by its reciprocal $1/(x + y\sqrt{D}) = (x - y\sqrt{D})/(x^2 - Dy^2)$ which lies in
$\mathbb{Q}(\sqrt{D})$ when $x$ and $y$ are rational.
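The reciprocal formula translates directly into exact arithmetic. The sketch below stores $x + y\sqrt{D}$ as a pair `(x, y)` of `Fraction` values; the names `qmul` and `qinverse` are our own, and $D$ is assumed to be a nonsquare integer so that $x^2 - Dy^2$ is nonzero for every nonzero element.

```python
from fractions import Fraction


def qmul(a, b, D):
    """Product of a = x + y*sqrt(D) and b = z + w*sqrt(D), each stored as a pair."""
    (x, y), (z, w) = a, b
    return (x * z + D * y * w, x * w + y * z)


def qinverse(a, D):
    """Reciprocal (x - y*sqrt(D)) / (x^2 - D*y^2) of a nonzero element of Q(sqrt(D))."""
    x, y = Fraction(a[0]), Fraction(a[1])
    n = x * x - D * y * y  # nonzero because D is assumed not to be a square
    return (x / n, -y / n)


# 1/(1 + sqrt(2)) = -1 + sqrt(2), which happens to lie in Z[sqrt(2)].
print(qinverse((1, 1), 2))
# 1/(2 + sqrt(2)) = 1 - (1/2)sqrt(2), which lies in Q(sqrt(2)) but not in Z[sqrt(2)].
print(qinverse((2, 1), 2))
```

The second example shows why division is not possible in general inside $\mathbb{Z}[\sqrt{D}]$: the reciprocal can have non-integer coordinates.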


8.1 Prime Factorization 


The ring of Gaussian integers Z[i] can be pictured as a subset of the plane, viewed 
as complex numbers in the usual way with x + yi corresponding to the point with 
coordinates (x,y). Thus Z[i] forms a square grid consisting of the points (x,y) 
with x and y integers: 


[Figure: the Gaussian integers $x + yi$ with $-3 \le x \le 3$ and $-2 \le y \le 2$, plotted as a square grid of labeled points in the plane.]


Similarly, the ring $\mathbb{Z}[\sqrt{D}]$ with $D < 0$ forms a grid of complex numbers forming
rectangles of height $\sqrt{|D|}$ obtained by stretching the previous figure vertically.

When $D > 0$ the numbers in $\mathbb{Z}[\sqrt{D}]$ are real numbers which would normally be
regarded as points along the $x$-axis. However, there is another point of view that will
make the case $D > 0$ look just like the case $D < 0$, and this is to regard a number
$x + y\sqrt{D}$ in $\mathbb{Z}[\sqrt{D}]$ or more generally $\mathbb{Q}(\sqrt{D})$ as the point $(x, y\sqrt{D})$ in the plane. Thus
for example $\mathbb{Z}[\sqrt{2}]$ is exactly the same rectangular grid as $\mathbb{Z}[\sqrt{-2}]$, with rectangles
of width 1 and height $\sqrt{2}$. From this point of view the horizontal and vertical axes
of the plane, instead of being the real and imaginary axes, are now regarded as the
“rational and irrational axes”, with the two coordinates $x$ and $y\sqrt{D}$ being the rational
and irrational parts of $x + y\sqrt{D}$.

A useful operation with complex numbers is to pass from a number $x + yi$ to
its complex conjugate $x - yi$ obtained by reflecting across the $x$-axis. In $\mathbb{Z}[\sqrt{D}]$ or
$\mathbb{Q}(\sqrt{D})$ with $D < 0$ this amounts to replacing $x + y\sqrt{D}$ by its conjugate $x - y\sqrt{D}$.
When $D > 0$ we can do exactly the same operation of reflecting $x + y\sqrt{D}$ across the
$x$-axis to the point $x - y\sqrt{D}$, which we again call the conjugate of $x + y\sqrt{D}$.


The ring $\mathbb{Z}[\sqrt{D}]$ is useful for factoring the form $x^2 - Dy^2$ as $(x + y\sqrt{D})(x - y\sqrt{D})$.
For this form the discriminant $\Delta = 4D$ is 0 mod 4, and it would be nice to treat also
the discriminants $\Delta = 4d + 1 \equiv 1$ mod 4, when the principal form is $x^2 + xy - dy^2$.
This factors in the following way:


$$x^2 + xy - dy^2 = \Bigl(x + \frac{1 + \sqrt{1+4d}}{2}\,y\Bigr)\Bigl(x + \frac{1 - \sqrt{1+4d}}{2}\,y\Bigr)$$


To simplify the notation we let $\omega = (1 + \sqrt{1+4d})/2$ and $\overline{\omega} = (1 - \sqrt{1+4d})/2$, the
conjugate of $\omega$, so the factorization becomes $x^2 + xy - dy^2 = (x + \omega y)(x + \overline{\omega} y)$.
The quadratic equation satisfied by $\omega$ is $\omega^2 - \omega - d = 0$. Thus $\omega^2 = \omega + d$ and this
allows the product of two numbers of the form $m + n\omega$ to be written in the same
form. In other words, the set $\mathbb{Z}[\omega]$ of all numbers $x + y\omega$ with $x$ and $y$ integers
is a ring, just like $\mathbb{Z}[\sqrt{D}]$. Note that $\overline{\omega}$ is an element of $\mathbb{Z}[\omega]$ since $\omega + \overline{\omega} = 1$, so
$\overline{\omega} = 1 - \omega$.


For example, when $d = -1$ we have $\omega = (1 + \sqrt{-3})/2$ and the elements of $\mathbb{Z}[\omega]$
form a grid of equilateral triangles in the $xy$-plane:


[Figure: the elements $m + n\omega$ of $\mathbb{Z}[\omega]$ for $d = -1$, plotted as a grid of equilateral triangles with labeled points in the rows $n = -2, \dots, 2$.]


The picture for larger negative values of $d$ is similar but stretched in the vertical
direction, forming a grid of isosceles triangles. For positive values of $d$ we can do
the same thing we did before with $\mathbb{Z}[\sqrt{D}]$ and regard $\mathbb{Z}[\omega]$ as a grid in the plane.
For example, for $d = 1$ we have $\omega = (1 + \sqrt{5})/2$ and $\mathbb{Z}[\omega]$ looks like the picture
for $d = -1$ stretched in the vertical direction so that the $y$-coordinate of $\omega$ is $\sqrt{5}/2$
rather than $\sqrt{3}/2$.


Elements of $\mathbb{Z}[\omega]$ can always be written in the form $m + n\omega = (a + b\sqrt{1+4d})/2$
for suitable integers $a$ and $b$. Here $a$ and $b$ must have the same parity since this
is true for $\omega = (1 + \sqrt{1+4d})/2$ and hence for any integer multiple $n\omega$, and then
adding an arbitrary integer $m$ to $n\omega$ preserves the equal parity condition since it
adds an even integer to $a$. Conversely, if two integers $a$ and $b$ have the same parity
then $(a + b\sqrt{1+4d})/2$ lies in $\mathbb{Z}[\omega]$ since by adding or subtracting a suitable even
integer from $a$ we can reduce to the case $a = b$ when one has a multiple of $\omega$. Notice
that having both $a$ and $b$ even is equivalent to $(a + b\sqrt{1+4d})/2$ lying in $\mathbb{Z}[\sqrt{1+4d}]$,
so $\mathbb{Z}[\sqrt{1+4d}]$ is a subset of $\mathbb{Z}[\omega]$. In the figure above we can see that $\mathbb{Z}[\sqrt{1+4d}]$
consists of the even rows, the numbers $m + n\omega$ with $n$ even.
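The parity condition amounts to the explicit change of coordinates $a = 2m + n$, $b = n$. A short sketch of the conversion in both directions follows; the function names are hypothetical and chosen only for this illustration.

```python
def to_half_form(m, n):
    """Write m + n*omega as (a + b*sqrt(1+4d))/2 and return the pair (a, b)."""
    return (2 * m + n, n)


def from_half_form(a, b):
    """Recover (m, n) from (a, b), or return None when a and b differ in parity."""
    if (a - b) % 2 != 0:
        return None  # (a + b*sqrt(1+4d))/2 is then not an element of Z[omega]
    return ((a - b) // 2, b)


print(to_half_form(3, 2))     # a and b come out with the same parity
print(from_half_form(1, 2))   # mixed parity, so not in Z[omega]
```

The conversion does not involve $d$ at all, reflecting the fact that the parity condition is the same for every $\mathbb{Z}[\omega]$.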

To have a unified notation for both the cases $\mathbb{Z}[\sqrt{D}]$ and $\mathbb{Z}[\omega]$ let us define $R_\Delta$
to be $\mathbb{Z}[\sqrt{D}]$ when the discriminant $\Delta$ is $4D$ and $\mathbb{Z}[\omega]$ when $\Delta$ is $4d + 1$. We will
often write elements of $R_\Delta$ using lower case Greek letters, for example $\alpha = x + y\sqrt{D}$
in $\mathbb{Z}[\sqrt{D}]$ with conjugate $\overline{\alpha} = x - y\sqrt{D}$, or $\alpha = x + y\omega$ in $\mathbb{Z}[\omega]$ with conjugate
$\overline{\alpha} = x + y\overline{\omega} = x + y(1 - \omega) = (x + y) - y\omega$.


The main theme of this section and the next will be how elements of $R_\Delta$ factor
into “primes” within $R_\Delta$. For example, if a prime $p$ in $\mathbb{Z}$ happens to be representable
as $p = x^2 - Dy^2$ then this is saying that $p$ is no longer prime in $\mathbb{Z}[\sqrt{D}]$ since it
factors as $p = (x + y\sqrt{D})(x - y\sqrt{D}) = \alpha\overline{\alpha}$ for $\alpha = x + y\sqrt{D}$ and $\overline{\alpha} = x - y\sqrt{D}$. Of
course, we should say precisely what we mean by a “prime” in $\mathbb{Z}[\sqrt{D}]$ or $\mathbb{Z}[\omega]$. For an
ordinary integer $p > 1$, being prime means that $p$ is divisible only by itself and 1. If
we allow negative numbers, we can “factor” a prime $p$ as $(-1)(-p)$, but this should
not count as a genuine factorization, otherwise there would be no primes at all in $\mathbb{Z}$.
In $R_\Delta$ things can be a little more complicated because of the existence of units in $R_\Delta$,
the nonzero elements $\varepsilon$ in $R_\Delta$ whose inverse $\varepsilon^{-1}$ also lies in $R_\Delta$. For example, in
the Gaussian integers $\mathbb{Z}[i]$ there are four obvious units, $\pm 1$ and $\pm i$, where for $\pm i$ we
have $(i)(-i) = 1$ so $i^{-1} = -i$ and $(-i)^{-1} = i$. We will see in a little while that these
are the only units in $\mathbb{Z}[i]$. Having four units in $\mathbb{Z}[i]$ instead of just $\pm 1$ complicates
the factorization issue slightly, but not excessively so.

For positive values of $\Delta$ things are somewhat less tidy because there are always
infinitely many units in $R_\Delta$. For example, in $\mathbb{Z}[\sqrt{2}]$ the number $\varepsilon = 3 + 2\sqrt{2}$ is a
unit because $(3 + 2\sqrt{2})(3 - 2\sqrt{2}) = 1$. All the powers of $3 + 2\sqrt{2}$ are therefore
also units, and there are infinitely many of them since $3 + 2\sqrt{2} > 1$ so the powers
$(3 + 2\sqrt{2})^n$ form an increasing infinite sequence approaching $+\infty$. Their inverses
$(3 + 2\sqrt{2})^{-n} = (3 - 2\sqrt{2})^n$ are a decreasing infinite sequence approaching 0.
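To see the first few of these units concretely, here is a brief sketch that multiplies out the powers of $3 + 2\sqrt{2}$ with integer arithmetic and checks that each power times its conjugate is 1. Elements $x + y\sqrt{D}$ are stored as pairs `(x, y)`, and the helper names are ours.

```python
def mul(a, b, D):
    """Product of x + y*sqrt(D) and z + w*sqrt(D), each stored as a pair."""
    (x, y), (z, w) = a, b
    return (x * z + D * y * w, x * w + y * z)


def times_conjugate(a, D):
    """The integer (x + y*sqrt(D))(x - y*sqrt(D)) = x^2 - D*y^2."""
    x, y = a
    return x * x - D * y * y


def powers(unit, D, count):
    """The list [unit, unit^2, ..., unit^count]."""
    result, current = [], (1, 0)
    for _ in range(count):
        current = mul(current, unit, D)
        result.append(current)
    return result


for x, y in powers((3, 2), 2, 5):
    print(f"{x} + {y}*sqrt(2), times its conjugate: {times_conjugate((x, y), 2)}")
```

The coordinates grow at every step, which is another way of seeing that the powers are all distinct.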

Whenever $\varepsilon$ is a unit in $R_\Delta$ we can factor any number $\alpha$ in $R_\Delta$ as $\alpha = (\alpha\varepsilon)(\varepsilon^{-1})$.
If we allowed this as a genuine factorization there would be no primes in $R_\Delta$, so it
is best not to consider it as a genuine factorization. This leads us to the following
definition: An element $\alpha$ of $R_\Delta$ is said to be prime in $R_\Delta$ if it is neither 0 nor a unit,
and if whenever we have a factorization of $\alpha$ as $\alpha = \beta\gamma$ with both $\beta$ and $\gamma$ in $R_\Delta$,
then it must be the case that either $\beta$ or $\gamma$ is a unit in $R_\Delta$. Not allowing units as
primes is analogous to the standard practice of not considering 1 to be a prime in $\mathbb{Z}$.

If we replace $R_\Delta$ by $\mathbb{Z}$ in the definition of primeness above, we get the condition
that an integer $a$ in $\mathbb{Z}$ is prime if its only factorizations are the trivial ones $a =
(a)(1) = (1)(a)$ and $a = (-a)(-1) = (-1)(-a)$, which is what we would expect.
This definition of primeness also means that we are allowing negative primes as the
negatives of the positive primes in $\mathbb{Z}$.

A word of caution: An integer $p$ in $\mathbb{Z}$ can be prime in $\mathbb{Z}$ but not prime in $R_\Delta$. For
example, in $\mathbb{Z}[i]$ we have the factorization $5 = (2 + i)(2 - i)$, and as we will be able
to verify soon, neither $2 + i$ nor $2 - i$ is a unit in $\mathbb{Z}[i]$. Hence by our definition 5 is
not a prime in $\mathbb{Z}[i]$, even though it is prime in $\mathbb{Z}$. Thus one always has to be careful
when speaking about primeness to distinguish “prime in $\mathbb{Z}$” from “prime in $R_\Delta$”.

Having defined what we mean by primes in $R_\Delta$ it is then natural to ask whether
every nonzero element of $R_\Delta$ that is not a unit can be factored as a product of primes,
and if so, whether this factorization is in any way unique. As we will see, the existence
of prime factorizations is fairly easy to prove, but the uniqueness question is much
more difficult and subtle. To clarify what the uniqueness question means, notice first
that if we have a unit $\varepsilon$ in $R_\Delta$ we can always modify a factorization $\alpha = \beta\gamma$ to give
another factorization $\alpha = (\varepsilon\beta)(\varepsilon^{-1}\gamma)$. This is analogous to writing $6 = (2)(3) =
(-2)(-3)$ in $\mathbb{Z}$. This sort of nonuniqueness is unavoidable, but it is also not too
serious a problem. So when we speak of factorization in $R_\Delta$ being unique, we will
always mean unique up to multiplying the factors by units.

A fruitful way to study factorizations in $R_\Delta$ is to relate them to factorizations in
$\mathbb{Z}$ by associating to each element $\alpha$ in $R_\Delta$ the number $N(\alpha) = \alpha\overline{\alpha}$ called the norm
of $\alpha$. Thus in the two cases $R_\Delta = \mathbb{Z}[\sqrt{D}]$ and $R_\Delta = \mathbb{Z}[\omega]$ we have:

$$N(x + y\sqrt{D}) = (x + y\sqrt{D})(x - y\sqrt{D}) = x^2 - Dy^2$$
$$N(x + y\omega) = (x + y\omega)(x + y\overline{\omega}) = x^2 + xy - dy^2$$

In both cases $N(\alpha)$ is an integer. Notice that when the discriminant is negative, so
$\alpha$ is a complex number $a + bi$ for real numbers $a$ and $b$, the norm of $\alpha$ is just
$\alpha\overline{\alpha} = (a + bi)(a - bi) = a^2 + b^2$, the square of the distance from $\alpha$ to the origin in
the complex plane. When the discriminant is positive the norm can be negative so it
does not have a nice geometric interpretation in terms of distance, but it will be quite
useful in spite of this.

The reason the norm is useful for studying factorizations is that it satisfies the 
following multiplicative property: 


Proposition 8.1. $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha$ and $\beta$ in $R_\Delta$.


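Proof: We will deduce multiplicativity of the norm from multiplicativity of the
conjugation operation, the fact that $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$. The argument will apply more generally to
all elements of $\mathbb{Q}(\sqrt{D})$ for any integer $D$ that is not a square. To verify that $\overline{\alpha\beta} = \overline{\alpha}\,\overline{\beta}$,
write $\alpha = x + y\sqrt{D}$ and $\beta = z + w\sqrt{D}$, so that $\alpha\beta = (xz + ywD) + (xw + yz)\sqrt{D}$.
Then:


$$\overline{\alpha\beta} = (xz + ywD) - (xw + yz)\sqrt{D} = (x - y\sqrt{D})(z - w\sqrt{D}) = \overline{\alpha}\,\overline{\beta}$$

For the norm we then have $N(\alpha\beta) = (\alpha\beta)(\overline{\alpha\beta}) = \alpha\beta\,\overline{\alpha}\,\overline{\beta} = \alpha\overline{\alpha}\,\beta\overline{\beta} = N(\alpha)N(\beta)$. □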


Using the multiplicative property of the norm we can derive a simple criterion for 
recognizing units: 


Proposition 8.2. An element $\varepsilon \in R_\Delta$ is a unit if and only if $N(\varepsilon) = \pm 1$.


Proof: Suppose $\varepsilon$ is a unit, so its inverse $\varepsilon^{-1}$ also lies in $R_\Delta$. Then $N(\varepsilon)N(\varepsilon^{-1}) =
N(\varepsilon\varepsilon^{-1}) = N(1) = 1$. Since $N(\varepsilon)$ and $N(\varepsilon^{-1})$ are integers this forces $N(\varepsilon)$ to be
$\pm 1$. For the converse we use the fact that a nonzero element $\alpha$ in $R_\Delta$ has inverse
$\alpha^{-1} = \overline{\alpha}/N(\alpha)$ since $\alpha(\overline{\alpha}/N(\alpha)) = 1$. Hence if $N(\varepsilon) = \pm 1$ we have $\varepsilon^{-1} = \pm\overline{\varepsilon}$ which
is an element of $R_\Delta$ if $\varepsilon$ is, so $\varepsilon$ is a unit. □

When $\Delta$ is negative there are very few units in $R_\Delta$. In the case of $\mathbb{Z}[\sqrt{D}]$ the
equation $N(x + y\sqrt{D}) = x^2 - Dy^2 = \pm 1$ has very few integer solutions when $D < 0$,
namely, if $D = -1$ there are only the four solutions $(x,y) = (\pm 1, 0)$ and $(0, \pm 1)$ while
if $D < -1$ there are only the two solutions $(x,y) = (\pm 1, 0)$. Thus the only units in
$\mathbb{Z}[i]$ are $\pm 1$ and $\pm i$, and the only units in $\mathbb{Z}[\sqrt{D}]$ for $D < -1$ are $\pm 1$. Geometrically
this is saying that these are the only points in the grid $\mathbb{Z}[\sqrt{D}]$ of distance 1 from
the origin, which is obviously true. In the case of $\mathbb{Z}[\omega]$ one can see from the earlier
figure of $\mathbb{Z}[\omega]$ that there are just six points of $\mathbb{Z}[\omega]$ of distance 1 from the origin
when $d = -1$, and only the two points $\pm 1$ when $d < -1$ and the figure is stretched
vertically. When $d = -1$ the six units are $\pm 1$, $\pm\omega$, and $\pm(\omega - 1)$. These are the powers
$\omega^n$ for $n = 1, 2, 3, 4, 5, 6$ since the general formula $\omega^2 = \omega + d$ gives $\omega^2 = \omega - 1$
when $d = -1$, and from this it follows that $\omega^3 = -1$, $\omega^4 = -\omega$, $\omega^5 = 1 - \omega$, and
$\omega^6 = 1$. When $d < -1$ the only units in $\mathbb{Z}[\omega]$ are $\pm 1$.
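The six units for $d = -1$ can be recomputed from the single rule $\omega^2 = \omega + d$. In this sketch an element $x + y\omega$ is stored as the pair `(x, y)`; the function names are illustrative, not notation from the text.

```python
def omega_mul(a, b, d):
    """Product of x + y*omega and z + w*omega, reduced using omega^2 = omega + d."""
    (x, y), (z, w) = a, b
    return (x * z + d * y * w, x * w + y * z + y * w)


def omega_norm(a, d):
    """The norm x^2 + xy - d*y^2 of x + y*omega."""
    x, y = a
    return x * x + x * y - d * y * y


d = -1
power = (1, 0)
for n in range(1, 7):
    power = omega_mul(power, (0, 1), d)  # multiply by omega
    print(f"omega^{n} = {power}, norm {omega_norm(power, d)}")

# A brute-force search of a small box finds the same six elements of norm 1.
units = [(x, y) for x in range(-3, 4) for y in range(-3, 4) if omega_norm((x, y), d) == 1]
print(units)
```

The search over a small box is enough here only because for $d = -1$ the norm is the squared distance to the origin, so elements of norm 1 cannot have large coordinates.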

The situation for $R_\Delta$ with $\Delta$ positive is quite different. For $\mathbb{Z}[\sqrt{D}]$ we are looking
for solutions of $x^2 - Dy^2 = \pm 1$ with $D > 0$, while for $\mathbb{Z}[\omega]$ the corresponding
equation is $x^2 + xy - dy^2 = \pm 1$ with $d > 0$. We know from our study of topographs
of hyperbolic forms that these equations have infinitely many integer solutions since
the value 1 occurs along the periodic separator line in the topograph of the principal
form when $(x,y) = (1,0)$, so it appears infinitely often by periodicity. For some
values of $D$ or $d$ the number $-1$ also appears along the separator line, and then it
too appears infinitely often. Thus when $\Delta > 0$ the ring $R_\Delta$ has infinitely many units
$\varepsilon = x + y\sqrt{D}$ or $x + y\omega$, with arbitrarily large values of $x$ and $y$.

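There is a nice interpretation of units in $R_\Delta$ as symmetries of the topograph of
the principal form of discriminant $\Delta$. A unit $\varepsilon$ in $R_\Delta$ defines a transformation $T_\varepsilon$ of
$R_\Delta$ by the formula $T_\varepsilon(\alpha) = \varepsilon\alpha$. In the case of $\mathbb{Z}[\sqrt{D}]$, if $\varepsilon = p + q\sqrt{D}$ then


$$T_\varepsilon(x + y\sqrt{D}) = (p + q\sqrt{D})(x + y\sqrt{D}) = (px + Dqy) + (qx + py)\sqrt{D}$$

while for $\mathbb{Z}[\omega]$, if $\varepsilon = p + q\omega$ we have


$$\begin{aligned} T_\varepsilon(x + y\omega) &= (p + q\omega)(x + y\omega) = (px + qy\omega^2) + (qx + py)\omega \\ &= (px + dqy) + (qx + (p+q)y)\omega \end{aligned}$$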


since $\omega^2 = \omega + d$. In both cases we see that $T_\varepsilon$ is a linear transformation of $x$
and $y$, with matrix $\begin{pmatrix} p & Dq \\ q & p \end{pmatrix}$ in the first case and $\begin{pmatrix} p & dq \\ q & p+q \end{pmatrix}$ in the second case. The
determinants in the two cases are $p^2 - Dq^2$ and $p^2 + pq - dq^2$ which equal $N(\varepsilon)$
and hence are $\pm 1$ since $\varepsilon$ is a unit. Thus $T_\varepsilon$ defines a linear fractional transformation
giving a symmetry of the Farey diagram. Since $N(\varepsilon\alpha) = N(\varepsilon)N(\alpha)$ we see that $T_\varepsilon$ is
an orientation-preserving symmetry of the topograph of the norm form when $N(\varepsilon) =
+1$ and an orientation-reversing skew symmetry when $N(\varepsilon) = -1$. The symmetry
corresponding to the “universal” unit $\varepsilon = -1$ is just the identity since $-x/-y = x/y$.

When $\Delta < 0$ the only interesting cases are $\Delta = -3$, when $T_\varepsilon$ for $\varepsilon = \omega$ is a
120 degree rotation of the topograph, and $\Delta = -4$ when $T_\varepsilon$ for $\varepsilon = i$ rotates the
topograph by 180 degrees.
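As a quick numerical check of the matrix description in the case of $\mathbb{Z}[\sqrt{D}]$, the sketch below builds the matrix of $T_\varepsilon$ for the unit $\varepsilon = 1 + \sqrt{2}$, which has norm $-1$, and applies it to a pair $(x,y)$. The function names are our own; only the $\mathbb{Z}[\sqrt{D}]$ case is covered.

```python
def unit_matrix(p, q, D):
    """Matrix of multiplication by p + q*sqrt(D), acting on the coordinates (x, y)."""
    return ((p, D * q), (q, p))


def apply(matrix, x, y):
    """Apply a 2x2 integer matrix to the column vector (x, y)."""
    (a, b), (c, d) = matrix
    return (a * x + b * y, c * x + d * y)


def norm(x, y, D):
    """The norm x^2 - D*y^2 of x + y*sqrt(D)."""
    return x * x - D * y * y


D = 2
M = unit_matrix(1, 1, D)                     # epsilon = 1 + sqrt(2)
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(det, norm(1, 1, D))                    # the determinant equals N(epsilon)

x, y = apply(M, 3, 2)                        # (1 + sqrt(2))(3 + 2*sqrt(2))
print((x, y), norm(x, y, D), norm(1, 1, D) * norm(3, 2, D))
```

The image of $3 + 2\sqrt{2}$ has norm $N(\varepsilon) \cdot N(3 + 2\sqrt{2}) = -1$, so this $T_\varepsilon$ changes the sign of every value of the norm form, as a skew symmetry should.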

When $\Delta > 0$ there is a fundamental unit $\varepsilon$ corresponding to the $\pm 1$ in the
topograph of the norm form at the vertex $p/q$ with smallest positive values of $p$
and $q$. When $N(\varepsilon) = +1$ the transformation $T_\varepsilon$ is then the translation giving the
periodicity along the separator line since it is an orientation-preserving symmetry.
In the opposite case $N(\varepsilon) = -1$ the transformation $T_\varepsilon$ is an orientation-reversing skew
symmetry so it must be a glide reflection along the separator line by half a period.


Proposition 8.3. When $\Delta > 0$ the units in $R_\Delta$ are exactly the elements $\pm\varepsilon^n$ for
$n \in \mathbb{Z}$, where $\varepsilon$ is the fundamental unit.


Proof: The units appear along the separator line at the regions $x/y$ where the norm
form takes the value $\pm 1$. From our previous comments, these are the points $T_\varepsilon^n(1/0)$
as $n$ varies over $\mathbb{Z}$. Since $T_\varepsilon$ is multiplication by $\varepsilon$, the power $T_\varepsilon^n$ is multiplication
by $\varepsilon^n$. Thus the values $\pm 1$ occur at the regions labeled $x/y$ for $\varepsilon^n = x + y\sqrt{D}$ or
$\varepsilon^n = x + y\omega$. The units are therefore the elements $\pm\varepsilon^n$ where the $\pm$ comes from the
fact that the topograph does not distinguish between $(x,y)$ and $(-x,-y)$. □
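For small $D$ the fundamental unit of $\mathbb{Z}[\sqrt{D}]$ can be found without drawing the topograph, simply by trying $q = 1, 2, 3, \dots$ until $Dq^2 \pm 1$ is a perfect square. The following is a brute-force sketch of that search and not the topograph method of the text; the function name is ours, $D$ is assumed to be positive and not a square, and the case of $\mathbb{Z}[\omega]$ is not treated.

```python
from math import isqrt


def fundamental_unit(D):
    """Smallest p, q > 0 with p^2 - D*q^2 = +1 or -1, returned as (p, q, norm).

    Assumes D > 0 is not a square, so that a solution exists and the loop ends.
    """
    q = 1
    while True:
        for n in (-1, 1):
            p_squared = D * q * q + n
            p = isqrt(p_squared)
            if p > 0 and p * p == p_squared:
                return (p, q, n)
        q += 1


for D in (2, 3, 5, 7):
    p, q, n = fundamental_unit(D)
    print(f"D = {D}: {p} + {q}*sqrt({D}) with norm {n}")
```

For $D = 2$ the search returns $1 + \sqrt{2}$ of norm $-1$, whose square is the unit $3 + 2\sqrt{2}$ considered earlier. The search can be slow when the fundamental unit is large.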


The conjugation operation in $R_\Delta$ sending $\alpha$ to $\overline{\alpha}$ also gives a symmetry of the
topograph of the norm form since $N(\alpha) = N(\overline{\alpha})$. Conjugation in $\mathbb{Z}[\sqrt{D}]$ sends an
element $x + y\sqrt{D}$ to $x - y\sqrt{D}$ so in the Farey diagram it is reflection across the edge
joining $1/0$ and $0/1$. Conjugation in $\mathbb{Z}[\omega]$ sends $x + y\omega$ to $x + y\overline{\omega} = (x + y) - y\omega$
since $\overline{\omega} = 1 - \omega$, so conjugation fixes the vertex $1/0$ and interchanges $0/1$ and $-1/1$
by reflecting across the line perpendicular to the edge from $0/1$ to $-1/1$.


Proposition 8.4. All symmetries and skew symmetries of the topograph of the norm
form are obtainable as combinations of conjugation and the transformations $T_\varepsilon$
associated to units $\varepsilon$ in $R_\Delta$.


Proof: It will suffice to reduce an arbitrary symmetry or skew symmetry $T$ to the
identity by composing with conjugation and transformations $T_\varepsilon$. If $T$ is a skew
symmetry we must have $\Delta > 0$ with $-1$ appearing along the separator line as well as $+1$.
Composing $T$ with a glide reflection $T_\varepsilon$ then converts $T$ into a symmetry, so we may
assume $T$ is a symmetry from now on. If $T$ reverses orientation of the Farey diagram
we may compose it with conjugation to reduce further to the case that $T$ preserves
orientation. When $\Delta < 0$ the only possibility for $T$ is then the identity except when
$\Delta = -4$ and $T = T_\varepsilon$ for $\varepsilon = i$, or when $\Delta = -3$ and $T = T_\varepsilon$ for $\varepsilon = \omega$ or $\omega^2$. If
$\Delta > 0$ the only possibility for $T$ is a translation along the separator line, which is $T_\varepsilon$
for some unit $\varepsilon$. □


Now we begin to study primes and prime factorizations in $R_\Delta$. First we have a
useful fact:


Proposition 8.5. If the norm $N(\alpha)$ of an element $\alpha$ in $R_\Delta$ is prime in $\mathbb{Z}$ then $\alpha$ is
prime in $R_\Delta$.


For example, when we factor 5 as $(2 + i)(2 - i)$ in $\mathbb{Z}[i]$, this proposition implies
that both factors are prime since the norm of each is 5, which is prime in $\mathbb{Z}$.


Proof: Suppose an element $\alpha$ in $R_\Delta$ has a factorization $\alpha = \beta\gamma$, hence $N(\alpha) =
N(\beta)N(\gamma)$. If $N(\alpha)$ is prime in $\mathbb{Z}$, this forces one of $N(\beta)$ and $N(\gamma)$ to be $\pm 1$, hence
one of $\beta$ and $\gamma$ is a unit. This means $\alpha$ is a prime since it cannot be 0 or a unit, as
its norm is a prime. □


The converse of this proposition is not generally true. For example the number
3 has norm 9, which is not prime in $\mathbb{Z}$, and yet 3 is prime in $\mathbb{Z}[i]$. This is because
if we had a factorization $3 = \alpha\beta$ in $\mathbb{Z}[i]$ with neither $\alpha$ nor $\beta$ a unit, then the
equation $N(\alpha)N(\beta) = N(3) = 9$ would imply that $N(\alpha) = \pm 3 = N(\beta)$, but there are
no elements of $\mathbb{Z}[i]$ with norm $\pm 3$ since the equation $x^2 + y^2 = \pm 3$ has no integer
solutions.


Now we can prove that prime factorizations always exist: 


Proposition 8.6. Every nonzero element of $R_\Delta$ that is not a unit can be factored as
a product of primes in $R_\Delta$.


Proof: We argue by induction on $|N(\alpha)|$. Since we are excluding 0 and units, the
induction starts with the case $|N(\alpha)| = 2$. In this case $\alpha$ must itself be a prime by the
preceding proposition since 2 is prime in $\mathbb{Z}$. For the induction step, if $\alpha$ is a prime
there is nothing to prove. If $\alpha$ is not prime, it factors as $\alpha = \beta\gamma$ with neither $\beta$ nor
$\gamma$ a unit, so $|N(\beta)| > 1$ and $|N(\gamma)| > 1$. Since $N(\alpha) = N(\beta)N(\gamma)$, it follows that
$|N(\beta)| < |N(\alpha)|$ and $|N(\gamma)| < |N(\alpha)|$. By induction, both $\beta$ and $\gamma$ are products of
primes in $R_\Delta$, hence their product $\alpha$ is also a product of primes in $R_\Delta$. □


Let us investigate how to compute a prime factorization by looking at the case
of $\mathbb{Z}[i]$. Assuming that factorizations of Gaussian integers into primes are unique
(up to units), which we will prove later, here is a procedure for finding the prime
factorization of a Gaussian integer $\alpha = a + bi$:


(1) Factor the integer $N(\alpha) = a^2 + b^2$ into primes $p_k$ in $\mathbb{Z}$.

(2) Determine how each $p_k$ factors into primes in $\mathbb{Z}[i]$.

(3) By the uniqueness of prime factorizations, the primes found in step (2) will be
factors of either $a + bi$ or $a - bi$ since they are factors of $(a + bi)(a - bi)$, so all
that remains is to test which of the prime factors of each $p_k$ are factors of $a + bi$.
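The three steps above can be sketched in code as follows. This is only an illustration under stated assumptions: it relies on unique factorization in $\mathbb{Z}[i]$, it carries out step (2) by a direct search for $p = a^2 + b^2$ (when no such representation exists there is no element of norm $p$, so $p$ itself is prime in $\mathbb{Z}[i]$ by the norm argument used for 3), and all function names are our own. A Gaussian integer $x + yi$ is stored as the pair `(x, y)`.

```python
from math import isqrt


def factorize(n):
    """Prime factorization of a positive integer as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def primes_over(p):
    """Step (2): the Gaussian primes dividing the rational prime p, up to units."""
    for a in range(1, isqrt(p) + 1):
        b_squared = p - a * a
        b = isqrt(b_squared)
        if 0 < b <= a and b * b == b_squared:
            return [(a, b), (a, -b)]
    return [(p, 0)]  # no element of norm p, so p stays prime in Z[i]


def divide(a, pi):
    """Quotient a / pi in Z[i], or None when pi does not divide a."""
    (x, y), (u, v) = a, pi
    n = u * u + v * v
    re, im = x * u + y * v, y * u - x * v  # a times the conjugate of pi
    if re % n or im % n:
        return None
    return (re // n, im // n)


def factor_gaussian(a):
    """Return (unit, [(prime, exponent), ...]) for a nonzero Gaussian integer a."""
    rest, found = a, []
    for p in factorize(a[0] ** 2 + a[1] ** 2):          # step (1)
        for pi in primes_over(p):                       # step (2)
            e = 0
            while (q := divide(rest, pi)) is not None:  # step (3)
                rest, e = q, e + 1
            if e:
                found.append((pi, e))
    return rest, found


print(factor_gaussian((3, 1)))
print(factor_gaussian((122, 79)))
```

The first component of the result is the unit left over after all prime factors have been divided out. Since primes are only determined up to units, the output may differ from a hand computation by units distributed among the factors.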


To illustrate this with a simple example, let us see how $3 + i$ factors in $\mathbb{Z}[i]$. We have
$N(3 + i) = (3 + i)(3 - i) = 10 = 2 \cdot 5$. These two numbers factor as $2 = (1 + i)(1 - i)$
and $5 = (2 + i)(2 - i)$. These are prime factorizations in $\mathbb{Z}[i]$ since $N(1 + i) = 2$ and
$N(2 + i) = 5$, both of which are primes in $\mathbb{Z}$. Now we test whether for example $1 + i$
divides $3 + i$ by dividing:

$$\frac{3+i}{1+i} = \frac{(3+i)(1-i)}{(1+i)(1-i)} = \frac{4-2i}{2} = 2 - i$$

Since the quotient $2 - i$ is a Gaussian integer, we conclude that $1 + i$ is a divisor of $3 + i$
and we have the factorization $3 + i = (1 + i)(2 - i)$. This is the prime factorization of
$3 + i$ since we have already noted that both $1 + i$ and $2 - i$ are primes in $\mathbb{Z}[i]$.

For a more complicated example consider $244 + 158i$. For a start, this factors
as $2(122 + 79i)$. Since 122 and 79 have no common factors in $\mathbb{Z}$ we cannot go any
farther by factoring out ordinary integers. We know that 2 factors as $(1 + i)(1 - i)$
and these two factors are prime in $\mathbb{Z}[i]$ since their norm is 2. It remains to factor
$122 + 79i$. This has norm $122^2 + 79^2 = 21125 = 5^3 \cdot 13^2$. Both 5 and 13 happen to
factor in $\mathbb{Z}[i]$, namely $5 = (2 + i)(2 - i)$ and $13 = (3 + 2i)(3 - 2i)$, and these are
prime factorizations since the norms of $2 + i$ and $3 + 2i$ are 5 and 13, primes in $\mathbb{Z}$.
Thus we have the prime factorization


$$(122 + 79i)(122 - 79i) = 5^3 \cdot 13^2 = (2 + i)^3 (2 - i)^3 (3 + 2i)^2 (3 - 2i)^2$$


Now we look at the factors on the right side of this equation to see which ones are
factors of $122 + 79i$. Suppose for example we test whether $2 + i$ divides $122 + 79i$:

$$\frac{122+79i}{2+i} = \frac{(122+79i)(2-i)}{(2+i)(2-i)} = \frac{323+36i}{5}$$

This is not a Gaussian integer, so $2 + i$ does not divide $122 + 79i$. Let us try $2 - i$
instead:

$$\frac{122+79i}{2-i} = \frac{(122+79i)(2+i)}{(2-i)(2+i)} = \frac{165+280i}{5} = 33 + 56i$$

So $2 - i$ does divide $122 + 79i$. In fact, we can expect that $(2 - i)^3$ will divide $122 + 79i$,
and it can be checked that it does. In a similar way one can check whether $3 + 2i$ or
$3 - 2i$ divides $122 + 79i$, and one finds that it is $3 - 2i$ that divides $122 + 79i$, and
in fact $(3 - 2i)^2$ divides $122 + 79i$. After these calculations one might expect that
$122 + 79i$ was the product $(2 - i)^3 (3 - 2i)^2$, but upon multiplying this product out
one finds that it is the negative of $122 + 79i$. Thus:


$$122 + 79i = (-1)(2 - i)^3 (3 - 2i)^2$$


The factor $-1$ is a unit, so it could be combined with one of the other factors, for
example changing one of the factors $2 - i$ to $i - 2$. Alternatively, we could replace
the factor $-1$ by $i^2$ and then multiply each $3 - 2i$ factor by $i$ to get a neater prime
factorization:

$$122 + 79i = (2 - i)^3 (2 + 3i)^2$$


Combining these calculations, we have the prime factorization for $244 + 158i$:


$$244 + 158i = (1 + i)(1 - i)(2 - i)^3 (2 + 3i)^2$$

The method in this example for computing prime factorizations in $\mathbb{Z}[i]$ depended
on unique factorization. When unique factorization fails, things are more complicated.
One of the simplest instances of this is in $\mathbb{Z}[\sqrt{-5}]$ where we have the factorizations:

$$6 = (2)(3) = (1 + \sqrt{-5})(1 - \sqrt{-5})$$

The only units in $\mathbb{Z}[\sqrt{-5}]$ are $\pm 1$, so these two factorizations do not differ just by
units. We can see that 2, 3, and $1 \pm \sqrt{-5}$ are prime in $\mathbb{Z}[\sqrt{-5}]$ by looking at norms.
Using the formula $N(x + y\sqrt{-5}) = x^2 + 5y^2$ we see that the norms of 2, 3, and $1 \pm \sqrt{-5}$
are 4, 9, and 6, so if one of 2, 3, or $1 \pm \sqrt{-5}$ was not a prime, it would have a factor of
norm 2 or 3 since these are the only numbers that occur in nontrivial factorizations
of 4, 9, and 6 in $\mathbb{Z}$. However, the equations $x^2 + 5y^2 = 2$ and $x^2 + 5y^2 = 3$ obviously
have no integer solutions so there are no elements of $\mathbb{Z}[\sqrt{-5}]$ of norm 2 or 3. Thus
in $\mathbb{Z}[\sqrt{-5}]$ the number 6 has two prime factorizations that do not differ just by units.
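The norm argument can be confirmed by a finite search, since $x^2 + 5y^2 = n$ forces $|x|$ and $|y|$ to be at most $\sqrt{n}$. A minimal sketch, with hypothetical helper names and elements $x + y\sqrt{-5}$ stored as pairs `(x, y)`:

```python
from math import isqrt


def elements_of_norm(n, D=-5):
    """All pairs (x, y) with x^2 - D*y^2 == n, assuming D < 0 and n >= 0."""
    bound = isqrt(n)
    return [(x, y)
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)
            if x * x - D * y * y == n]


def mul(a, b, D=-5):
    """Product of x + y*sqrt(D) and z + w*sqrt(D), each stored as a pair."""
    (x, y), (z, w) = a, b
    return (x * z + D * y * w, x * w + y * z)


print(elements_of_norm(1))        # the units
print(elements_of_norm(2))        # empty: no element of norm 2
print(elements_of_norm(3))        # empty: no element of norm 3
print(mul((2, 0), (3, 0)))        # 2 * 3
print(mul((1, 1), (1, -1)))       # (1 + sqrt(-5))(1 - sqrt(-5))
```

Both products equal 6, and the empty lists for norms 2 and 3 are exactly what rules out any further factorization of 2, 3, or $1 \pm \sqrt{-5}$.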

This example can be explained by the fact that $x^2 + 5y^2$ is not the only quadratic
form of discriminant $-20$, up to equivalence. Another form of the same discriminant
is $2x^2 + 2xy + 3y^2$, and this form takes on the values 2 and 3 that the form $x^2 + 5y^2$
omits, even though $x^2 + 5y^2$ does take on the value $6 = 2 \cdot 3$. Here are the topographs
of these two forms, with prime values circled:


[Figure: topographs of the forms $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$.]


The appearance of 6 in the topograph of $x^2 + 5y^2 = (x + y\sqrt{-5})(x - y\sqrt{-5})$ when
$x/y = 1/1$ gives the factorization $6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$.

The boxed nonprime numbers in the topograph of $x^2 + 5y^2$ give rise to other 
nonunique prime factorizations. For example $14 = (2)(7) = (3 + \sqrt{-5})(3 - \sqrt{-5})$ 
where the second factorization comes from the appearance of 14 in the topograph of 
$x^2 + 5y^2$ when $x/y = 3/1$. As with the earlier factorizations of 6, the nonappearance 
of 2 and 7 in the topograph of $x^2 + 5y^2$ implies that $2$, $7$, and $3 \pm \sqrt{-5}$ are prime in 
$\mathbb{Z}[\sqrt{-5}]$. Some numbers in the topograph of $x^2 + 5y^2$ occur in boxes twice, leading 
to three different prime factorizations. Thus 21 factors into primes in $\mathbb{Z}[\sqrt{-5}]$ as 
$3 \cdot 7$, as $(1 + 2\sqrt{-5})(1 - 2\sqrt{-5})$ and as $(4 + \sqrt{-5})(4 - \sqrt{-5})$. Another example is 
$69 = 3 \cdot 23 = (7 + 2\sqrt{-5})(7 - 2\sqrt{-5}) = (8 + \sqrt{-5})(8 - \sqrt{-5})$. 
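The factorizations above come from solutions of $x^2 + 5y^2 = n$ with $y \neq 0$, and these can be found by a short search instead of reading them off the topograph. The following is a minimal sketch; the function name `representations` is hypothetical, and the code only lists factorizations of the form $(x + y\sqrt{-5})(x - y\sqrt{-5})$ without checking that the factors are prime.

```python
from math import isqrt

def representations(n):
    """Pairs (x, y) with x > 0, y > 0 and x^2 + 5*y^2 = n."""
    reps = []
    y = 1
    while 5 * y * y < n:
        rest = n - 5 * y * y
        x = isqrt(rest)
        if x * x == rest:
            reps.append((x, y))
        y += 1
    return reps

for n in (6, 14, 21, 69):
    for x, y in representations(n):
        # (x + y*sqrt(-5)) * (x - y*sqrt(-5)) = x^2 + 5*y^2 = n
        print(f"{n} = ({x} + {y}*sqrt(-5)) * ({x} - {y}*sqrt(-5))")

# The values 2, 3, and 7 are omitted by the form, so nothing is found for them.
print(representations(2), representations(3), representations(7))
```

For 21 and 69 the search finds two pairs each, matching the two boxed occurrences of these numbers in the topograph.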

The numbers that appear in the topograph of the second form $2x^2 + 2xy + 3y^2$ are 
not the norms of elements of $\mathbb{Z}[\sqrt{-5}]$ but one might imagine that they are the norms 
of “ideal numbers” of some sort. Thus 2 might be the norm of an ideal number $P$, so 
$2 = P\bar{P}$, and 3 might be the norm of an ideal number $Q$, so $3 = Q\bar{Q}$. The product 
$PQ$ would then have norm $(PQ)(\overline{PQ}) = (P\bar{P})(Q\bar{Q}) = 2 \cdot 3 = 6$, so it is possible that 
$PQ = 1 + \sqrt{-5}$. If this is true, it would explain very nicely the two factorizations of 6 
as $2 \cdot 3 = (P\bar{P})(Q\bar{Q})$ and as $(1 + \sqrt{-5})(1 - \sqrt{-5}) = (PQ)(\bar{P}\bar{Q})$. 

One can also see how some numbers might have three different prime factorizations. 
For example for $21 = 3 \cdot 7$, if we have $3 = P\bar{P}$ and $7 = Q\bar{Q}$ then there are three 
ways to group these four ideal numbers into pairs, as $(P\bar{P})(Q\bar{Q})$, as $(PQ)(\bar{P}\bar{Q})$, and 
as $(P\bar{Q})(\bar{P}Q)$, and these three groupings could give the three factorizations of 21. 
The reason there are only two factorizations for $2 \cdot 3$ and $2 \cdot 7$ is that in the factorization 
$2 = P\bar{P}$ the two factors $P$ and $\bar{P}$ happen to be equal, so there is no difference 
between $(PQ)(\bar{P}\bar{Q})$ and $(P\bar{Q})(\bar{P}Q)$. 

Much of this chapter will be devoted to making sense of these “ideal numbers”. 
They will be realized not by actual numbers but by certain sets of numbers in $R_\Delta$ 
called simply “ideals”. These ideals behave like actual numbers in some respects. 
Most importantly they can be multiplied and they have norms which are ordinary 
integers, behaving much like norms of elements of $R_\Delta$. On the other hand there is 
no method for adding ideals that behaves like addition of numbers, so ideals are not 
entirely like numbers. However, this will not matter for studying prime factorizations 
where multiplication is what one is interested in. 

There is a natural notion of what a prime ideal is, and the big theorem about 
ideals in $R_\Delta$ is that they have unique factorizations into prime ideals when $\Delta$ is a 
fundamental discriminant. This gives information about prime factorizations of elements 
of $R_\Delta$ because each element of $R_\Delta$ gives rise to a special kind of ideal called 
a principal ideal. For some discriminants all ideals are principal ideals, and for these 
discriminants the unique prime factorization of ideals translates into unique prime 
factorization of elements of $R_\Delta$. 

In the remainder of this section and continuing in the next we will go further 
into prime factorizations of elements of $R_\Delta$ before beginning the study of ideals in 
Section 8.3. 

The question of how a prime $p$ in $\mathbb{Z}$ factors in $R_\Delta$ can be rephrased in terms of 
the norm form $x^2 - Dy^2$ or $x^2 + xy - dy^2$, according to the following result: 


Proposition 8.7. Let $p$ be a prime in $\mathbb{Z}$. Then: 

(a) If either $p$ or $-p$ is represented by the norm form for $R_\Delta$, so $N(\alpha) = \pm p$ for 
some $\alpha$ in $R_\Delta$, then $p$ factors in $R_\Delta$ as $p = \pm\alpha\bar{\alpha}$ and both these factors are 
prime in $R_\Delta$. 

(b) If neither $p$ nor $-p$ is represented by the norm form then $p$ remains prime 
in $R_\Delta$. 


In statement (a) note that when $\Delta < 0$ the norm only takes positive values, so if a 
positive prime $p$ factors in $R_\Delta$ it must factor as $p = \alpha\bar{\alpha}$, never as $-\alpha\bar{\alpha}$. However for 
$\Delta > 0$ this need not be the case. For example for $\mathbb{Z}[\sqrt{3}]$ the topograph of $x^2 - 3y^2$ 
shown in Section 4.1 contains the value $-2$ but not $2$, so the prime 2 factors as 
$-(1 + \sqrt{3})(1 - \sqrt{3})$ in $\mathbb{Z}[\sqrt{3}]$ but not as $\alpha\bar{\alpha}$. 


Proof: For part (a), if $p = \pm N(\alpha)$, then $p$ factors in $R_\Delta$ as $p = \pm\alpha\bar{\alpha} = \pm N(\alpha)$. The 
two factors are prime since their norm is $\pm p$ which is prime in $\mathbb{Z}$ by assumption. 
For (b), if $p$ is not a prime in $R_\Delta$ then it factors in $R_\Delta$ as $p = \alpha\beta$ with neither 
$\alpha$ nor $\beta$ a unit. Then $N(p) = p^2 = N(\alpha)N(\beta)$ with neither $N(\alpha)$ nor $N(\beta)$ equal to 
$\pm 1$, hence we must have $N(\alpha) = N(\beta) = \pm p$. The equation $N(\alpha) = \pm p$ says that the 
norm form represents $\pm p$. Thus if the norm form represents neither $p$ nor $-p$ then 
$p$ must be prime in $R_\Delta$. □ 
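For the Gaussian integers the norm form is $x^2 + y^2$, so the dichotomy in Proposition 8.7 can be tested by a direct search. The sketch below is only an illustration for this one ring; the function name `represent` is hypothetical.

```python
from math import isqrt

def represent(p):
    """Return (x, y) with x^2 + y^2 = p and x >= y >= 0, or None if there is none."""
    for y in range(isqrt(p) + 1):
        rest = p - y * y
        x = isqrt(rest)
        if x * x == rest and x >= y:
            return (x, y)
    return None

for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
    rep = represent(p)
    if rep is None:
        # case (b): the norm form does not represent p
        print(f"{p} remains prime in Z[i]")
    else:
        # case (a): p = N(x + yi) = (x + yi)(x - yi)
        x, y = rep
        print(f"{p} = ({x} + {y}i)({x} - {y}i)")
```

Since the norm is never negative here, only the sign $+p$ has to be considered.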


Proposition 8.8. If $R_\Delta$ has unique factorization into primes then the only primes 
in $R_\Delta$ are the primes described in (a) or (b) of the preceding proposition, or units 
times these primes. 


This can be false without unique prime factorization since the primes in $R_\Delta$ obtained 
by factoring a prime integer $p$ have norm dividing $N(p) = p^2$, but we have 
seen for example that $1 + \sqrt{-5}$ is prime in $\mathbb{Z}[\sqrt{-5}]$ and has norm 6. 


Proof: Let $\alpha$ be an arbitrary prime in $R_\Delta$. The norm $n = N(\alpha) = \alpha\bar{\alpha}$ is an integer in 
$\mathbb{Z}$ so it can be factored as a product $n = p_1 \cdots p_k$ of primes in $\mathbb{Z}$. By Proposition 8.7 
each $p_i$ either stays prime in $R_\Delta$ or factors as a product $\pm\alpha_i\bar{\alpha}_i$ of two primes in $R_\Delta$. 
This gives a factorization of $n$ into primes in $R_\Delta$. A second factorization of $n$ into 
primes in $R_\Delta$ can be obtained from the formula $n = \alpha\bar{\alpha}$ by factoring $\bar{\alpha}$ into primes 
since the first factor $\alpha$ is already prime by assumption. (In fact if $\alpha$ is prime then $\bar{\alpha}$ 
will also be prime, but we do not need to know this.) If we have unique factorization 
in $R_\Delta$ then the prime factor $\alpha$ of the second prime factorization will have to be one 
of the prime factors in the first prime factorization of $n$, or a unit times one of these 
primes. Thus $\alpha$ will be a unit times a prime of one of the two types described in 
Proposition 8.7. □ 




There are two qualitatively different ways in which a prime $p$ in $\mathbb{Z}$ can factor as 
the product of two primes in $R_\Delta$, depending on whether the two primes in $R_\Delta$ differ 
by just a unit, or equivalently, whether $p$ is a unit times the square of an element of 
$R_\Delta$. For example in $\mathbb{Z}[i]$ we have $2 = (1 + i)(1 - i)$ and the two factors $1 + i$ and $1 - i$ 
differ only by a unit since $-i(1 + i) = 1 - i$. Thus $2 = \varepsilon(1 + i)^2$ for the unit $\varepsilon = -i$. 
In fact 2 is the only prime that can be factored in $\mathbb{Z}[i]$ as $p = \varepsilon(a + bi)^2$ for some 
unit $\varepsilon$. The units in $\mathbb{Z}[i]$ are $\pm 1$ and $\pm i$ so the only possibilities are $p = \pm(a + bi)^2$ 
and $p = \pm i(a + bi)^2$. In the first case $p = \pm(a + bi)^2 = \pm(a^2 - b^2 + 2abi)$ so $2ab = 0$ 
hence either $a$ or $b$ is 0, but that would say $p = \pm a^2$ or $p = \pm b^2$ which is impossible 
since $p$ is prime. The other case is $p = \pm i(a + bi)^2 = \pm((a^2 - b^2)i - 2ab)$ hence 
$a^2 = b^2$ and $p = \mp 2ab$, so $a$ and $b$ are $\pm 1$ and $p = 2$. 


Exercises 


1. (a) Show that if $\alpha$ and $\beta$ are elements of $\mathbb{Z}[\sqrt{D}]$ such that $\alpha$ is a unit times $\beta$, then 
$N(\alpha) = \pm N(\beta)$. 

(b) Either prove or give a counterexample to the following statement: If $\alpha$ and $\beta$ are 
Gaussian integers with $N(\alpha) = N(\beta)$ then $\alpha$ is a unit times $\beta$. 


2. Show that a Gaussian integer $x + yi$ with both $x$ and $y$ odd is divisible by $1 + i$ 
but not by $(1 + i)^2$. 


3. There are four different ways to write the number $1105 = 5 \cdot 13 \cdot 17$ as a sum of two 
squares. Find these four ways using the factorization of 1105 into primes in $\mathbb{Z}[i]$. 
Here we are not counting $5^2 + 2^2$ and $2^2 + 5^2$ as different ways of expressing 29 as the 
sum of two squares. Note that an equation $n = a^2 + b^2$ is equivalent to an equation 
$n = (a + bi)(a - bi)$. 


4. (a) Find four different units in $\mathbb{Z}[\sqrt{3}]$ that are positive real numbers, and find four 
that are negative. 

(b) Do the same for $\mathbb{Z}[\sqrt{11}]$. 

5. Make a list of all the Gaussian primes $x + yi$ with $-7 \le x \le 7$ and $-7 \le y \le 7$. 
(The only actual work here is to figure out the primes $x + yi$ with $0 \le y \le x \le 7$ 
since the rest are obtainable from these by symmetry properties.) 


6. Factor the following Gaussian integers into primes in $\mathbb{Z}[i]$: $3 + 5i$, $8 - i$, $10 + i$, 
$5 - 12i$, $35i$, $-35 + 120i$, $253 + 204i$. 


7. (a) Show that if an odd prime $p$ factors in $\mathbb{Z}[\omega]$ for $\omega = (1 + \sqrt{-3})/2$ then it factors 
in $\mathbb{Z}[\sqrt{-3}]$. 

(b) Do the same with $-3$ replaced by $-7$. 

(c) Show that this no longer holds when $-3$ is replaced by $-11$. 


8. (a) Determine how the number 2 factors into primes in $R_\Delta$ for $\Delta = -3, -4, -7, -8, 
-11, -12, -15$, and $-16$. 

(b) Do the same for $\Delta = 5, 8$, and $12$. 


9. Show that if an element $\alpha$ in $R_\Delta$ is prime then so is $\bar{\alpha}$. 


10. (a) Find a number $n$ that has exactly two different factorizations into primes in 
$\mathbb{Z}[\sqrt{-6}]$, up to multiplication by units, and find another number that has exactly three 
such factorizations. 

(b) Do the same for $\mathbb{Z}[\sqrt{10}]$ where things are slightly more complicated since there 
are many more units. 


11. Show that the factorization of a prime $p$ in $\mathbb{Z}$ into primes in $R_\Delta$ is always unique 
up to units. (See Propositions 6.16 and 8.4.) 


8.2 Unique Factorization via the Euclidean Algorithm 


The main goal in this section is to show that unique factorization holds for the 
Gaussian integers Z[i] and in a few other cases as well. The plan will be to see that 
Gaussian integers have a Euclidean algorithm much like the Euclidean algorithm in Z, 
then deduce unique factorization from this Euclidean algorithm. 

In order to prove that prime factorizations are unique we will use the following 
special property that holds in $\mathbb{Z}$ and in some of the rings $R_\Delta$ as well: 


(*) If a prime $p$ divides a product $ab$ then $p$ must divide either $a$ or $b$. 


One way to prove this for Z would be to consider the prime factorization of ab, which 
can be obtained by factoring each of a and b into primes separately. Then if the prime 
p divides ab, it would have to occur in the prime factorization of ab, hence it would 
occur in the prime factorization of either a or b, which would say that p divides a 
or b. 

This argument assumed implicitly that the prime factorization of $ab$ was unique. 
Thus the property (*) is a consequence of unique factorization into primes. But the 
property (*) also implies that prime factorizations are unique. To see why, consider 
two factorizations of a number $n > 1$ into positive primes: 

$$n = p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_l$$

We can assume $k \le l$ by interchanging the $p_i$’s and $q_i$’s if necessary. We want to 
argue that if (*) holds for each $p_i$, then the $q_i$’s are just a permutation of the $p_i$’s 
and in particular $k = l$. The argument to prove this goes as follows. Consider first 
the prime $p_1$. This divides the product $q_1(q_2 \cdots q_l)$ so by property (*) it divides 
either $q_1$ or $q_2 \cdots q_l$. In the latter case, another application of (*) shows that $p_1$ 
divides either $q_2$ or $q_3 \cdots q_l$. Repeating this argument as often as necessary, we 
conclude that $p_1$ must divide at least one $q_i$. After permuting the $q_i$’s we can assume 
that $p_1$ divides $q_1$. We are assuming all the $p_i$’s and $q_i$’s are positive, so the fact that 
the prime $p_1$ divides the prime $q_1$ implies that $p_1$ equals $q_1$. We can then cancel 
$p_1$ and $q_1$ from the equation $p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_l$ to get $p_2 \cdots p_k = q_2 \cdots q_l$. 
Now repeat the argument to show that $p_2$ equals some remaining $q_i$ which we can 
assume is $q_2$ after a permutation. After further repetitions we eventually reach the 
point that the final $p_k$ is a product of the remaining $q_i$’s. But then since $p_k$ is prime 
there could only be one remaining $q_i$, so we would have $k = l$ and $p_k = q_k$, finishing 
the argument. 

If we knew the analogue of property (*) held for primes in $R_\Delta$, we could make 
essentially the same argument to show that unique factorization holds in $R_\Delta$. The only 
difference in the argument would be that we would have to take units into account. 
The argument would be exactly the same up to the point where we concluded that $p_1$ 
divides $q_1$. Then the fact that $q_1$ is prime would not say that $p_1$ and $q_1$ were equal, 
but only that $q_1$ is a unit times $p_1$, so we would have an equation $q_1 = \varepsilon p_1$ with $\varepsilon$ 
a unit. Then we would have $p_1 p_2 \cdots p_k = \varepsilon p_1 q_2 \cdots q_l$. Canceling $p_1$ would then 
yield $p_2 p_3 \cdots p_k = \varepsilon q_2 q_3 \cdots q_l$. The product $\varepsilon q_2$ is prime if $q_2$ is prime, so if we 
let $q_2' = \varepsilon q_2$ we would have $p_2 p_3 \cdots p_k = q_2' q_3 \cdots q_l$. The argument could then be 
repeated to show eventually that the $q_i$’s are the same as the $p_i$’s up to permutation 
and multiplication by units, which is what unique factorization means. 


Since the property (*) implies unique factorization, it will not hold in $R_\Delta$ when 
$R_\Delta$ does not have unique factorization. For a concrete example consider $\mathbb{Z}[\sqrt{-5}]$. 
Here we had nonunique prime factorizations $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. The 
prime 2 thus divides the product $(1 + \sqrt{-5})(1 - \sqrt{-5})$ but it does not divide either 
factor $1 \pm \sqrt{-5}$ since $(1 \pm \sqrt{-5})/2$ is not an element of $\mathbb{Z}[\sqrt{-5}]$. 


Our task now is to prove the property (*) without using unique factorization. 
As we saw in Chapter 2, an equation ax + by = 1 always has integer solutions (x,y) 
whenever a and b are coprime integers. This fact can be used to show that property 
(*) holds in Z. To see how, suppose that a prime p divides a product ab. It will 
suffice to show that if p does not divide a then it must divide b. If p does not 
divide a, then since p is prime, p and a are coprime. This implies that the equation 
px +ay = 1 is solvable with integers x and y. Now multiply this equation by b to 
get an equation b = pbx +aby. The number p divides the right side of this equation 
since it obviously divides pbx and it divides ab by assumption. Hence p divides b, 
which is what we wanted to show. 

The fact that equations $ax + by = 1$ in $\mathbb{Z}$ are solvable whenever $a$ and $b$ are 
coprime can be deduced from the Euclidean algorithm in the following way. What the 
Euclidean algorithm gives is a method for starting with two positive integers $\alpha_0$ and 
$\alpha_1$ and constructing a sequence of positive integers $\alpha_i$ and $\beta_i$ satisfying the following 
equations: 

$$\begin{aligned}
\alpha_0 &= \beta_1\alpha_1 + \alpha_2 \\
\alpha_1 &= \beta_2\alpha_2 + \alpha_3 \\
&\;\;\vdots \\
\alpha_{n-2} &= \beta_{n-1}\alpha_{n-1} + \alpha_n \\
\alpha_{n-1} &= \beta_n\alpha_n + \alpha_{n+1} \\
\alpha_n &= \beta_{n+1}\alpha_{n+1}
\end{aligned}$$

From these equations we can deduce two consequences: 

(1) $\alpha_{n+1}$ divides $\alpha_0$ and $\alpha_1$. 
(2) The equation $\alpha_{n+1} = \alpha_0 x + \alpha_1 y$ is solvable in $\mathbb{Z}$. 


To see why (1) is true, note that the last equation implies that $\alpha_{n+1}$ divides $\alpha_n$. Then 
the next-to-last equation implies that $\alpha_{n+1}$ also divides $\alpha_{n-1}$, and the equation before 
this then implies that $\alpha_{n+1}$ also divides $\alpha_{n-2}$, and so on until one deduces that $\alpha_{n+1}$ 
divides all the $\alpha_i$’s and in particular $\alpha_0$ and $\alpha_1$. 

To see why (2) is true, observe that each equation before the last one allows an 
$\alpha_i$ to be expressed as a linear combination of $\alpha_{i-1}$ and $\alpha_{i-2}$. Then by repeatedly 
substituting in, one can express each $\alpha_i$ in terms of $\alpha_0$ and $\alpha_1$ as a linear combination 
$x\alpha_0 + y\alpha_1$ with integer coefficients $x$ and $y$. In particular $\alpha_{n+1}$ can be represented 
in this way, which says that the equation $\alpha_{n+1} = \alpha_0 x + \alpha_1 y$ is solvable in $\mathbb{Z}$. 

Now if we assume that $\alpha_0$ and $\alpha_1$ are coprime then $\alpha_{n+1}$ must be 1 by (1), and 
by (2) we get integers $x$ and $y$ such that $\alpha_0 x + \alpha_1 y = 1$, as we wanted. 

Putting all the preceding arguments together, we see that the Euclidean algorithm 
in Z implies unique factorization in Z. 
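The substitution argument for statement (2) can be turned into a short program. The sketch below is one common way to organize it: instead of substituting backwards at the end, it carries along coefficients expressing each remainder in terms of the two starting integers. The function name `extended_euclid` and the sample inputs 67 and 24 are chosen only for illustration.

```python
def extended_euclid(a0, a1):
    """Return (g, x, y) with g = gcd(a0, a1) = a0*x + a1*y, for positive integers a0, a1."""
    # Invariant: r_prev = a0*x_prev + a1*y_prev and r_cur = a0*x_cur + a1*y_cur.
    r_prev, r_cur = a0, a1
    x_prev, x_cur = 1, 0
    y_prev, y_cur = 0, 1
    while r_cur != 0:
        q = r_prev // r_cur                          # the quotient beta_i
        r_prev, r_cur = r_cur, r_prev - q * r_cur    # the next remainder
        x_prev, x_cur = x_cur, x_prev - q * x_cur
        y_prev, y_cur = y_cur, y_prev - q * y_cur
    return r_prev, x_prev, y_prev                    # last nonzero remainder and its coefficients

g, x, y = extended_euclid(67, 24)
print(g, x, y)
assert 67 * x + 24 * y == g
```

When the two inputs are coprime the returned `g` is 1, so `x` and `y` solve the equation needed in the proof of property (*).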

A very similar argument works in $R_\Delta$ provided that one has a Euclidean algorithm 
to produce the sequence of equations above starting with any pair of nonzero elements 
$\alpha_0$ and $\alpha_1$ in $R_\Delta$, with all the numbers $\alpha_i$ and $\beta_i$ now being elements of $R_\Delta$. The 
statements (1) and (2) again follow from these equations, with the equation in (2) now 
being solvable with $x$ and $y$ in $R_\Delta$. For the application to unique factorization the 
coefficients $\alpha_0$ and $\alpha_1$ will be coprime in the sense that their only common divisors 
are units, so $\alpha_{n+1}$ will be a unit. A solution of $\alpha_{n+1} = \alpha_0 x + \alpha_1 y$ can then be modified 
by multiplying $x$ and $y$ by $\alpha_{n+1}^{-1}$ to get a solution of $1 = \alpha_0 x + \alpha_1 y$. By the argument 
given before, this is all that is needed to imply unique factorization in $R_\Delta$. 

Let us show now that there is a Euclidean algorithm in the Gaussian integers $\mathbb{Z}[i]$. 
The key step is to be able to find, for each pair of nonzero Gaussian integers $\alpha_0$ and 
$\alpha_1$, two more Gaussian integers $\beta_1$ and $\alpha_2$ such that $\alpha_0 = \beta_1\alpha_1 + \alpha_2$ with $\alpha_2$ being 
“smaller” than $\alpha_1$. We measure “smallness” of complex numbers by computing their 
distance to the origin in the complex plane. For a complex number $\alpha = x + yi$ this 
distance is $\sqrt{x^2 + y^2}$. Here $x^2 + y^2$ is just the norm $N(\alpha)$ when $x$ and $y$ are integers, 
so we could measure the size of a Gaussian integer $\alpha$ by $\sqrt{N(\alpha)}$. However it is simpler 
just to use $N(\alpha)$ without the square root, so this is what we will do. 

Thus our goal is to find an equation $\alpha_0 = \beta_1\alpha_1 + \alpha_2$ with $N(\alpha_2) < N(\alpha_1)$, starting 
from two given nonzero Gaussian integers $\alpha_0$ and $\alpha_1$. If we can always do this, then 
by repeating the process we can construct a sequence of $\alpha_i$’s and $\beta_i$’s where the 
successive $\alpha_i$’s have smaller and smaller norms. Since these norms are nonnegative 
integers, they cannot keep decreasing infinitely often, so eventually the process will 
reach an $\alpha_i$ of norm 0, hence this $\alpha_i$ will be 0 and the Euclidean algorithm will end 
in a finite number of steps, as it should. 

The equation $\alpha_0 = \beta_1\alpha_1 + \alpha_2$ is saying that when we divide $\alpha_1$ into $\alpha_0$, we obtain 
a quotient $\beta_1$ and a remainder $\alpha_2$. What we want is for the remainder $\alpha_2$ to have a 
smaller norm than $\alpha_1$. To get an idea how we can do this let us look instead at the 
equivalent equation 

$$\frac{\alpha_0}{\alpha_1} = \beta_1 + \frac{\alpha_2}{\alpha_1}$$

If we were working with ordinary integers, the quotient $\beta_1$ would be the integer part 
of the rational number $\alpha_0/\alpha_1$ and $\alpha_2/\alpha_1$ would be the remaining fractional part. For 
Gaussian integers we do something similar, but instead of taking $\beta_1$ to be the “integer 
part” of $\alpha_0/\alpha_1$ we take it to be a Gaussian integer of minimum distance to $\alpha_0/\alpha_1$. 
As an example let us take $\alpha_0 = 12 + 15i$ and $\alpha_1 = 5 + 2i$. Then: 

$$\frac{\alpha_0}{\alpha_1} = \frac{12 + 15i}{5 + 2i} = \frac{(12 + 15i)(5 - 2i)}{(5 + 2i)(5 - 2i)} = \frac{90 + 51i}{29} = (3 + 2i) + \frac{3 - 7i}{29} = \beta_1 + \frac{\alpha_2}{\alpha_1}$$

Here in the last step we choose $3 + 2i$ as $\beta_1$ because 3 is the closest integer to $90/29$ 
and 2 is the closest integer to $51/29$. Having found a likely candidate for $\beta_1$, we can 
use the equation $\alpha_0 = \beta_1\alpha_1 + \alpha_2$ to find $\alpha_2$: 

$$12 + 15i = (3 + 2i)(5 + 2i) + \alpha_2 = (11 + 16i) + \alpha_2, \quad\text{hence } \alpha_2 = 1 - i$$

Since $N(1 - i) = 2$ and $N(5 + 2i) = 29$ we have $N(\alpha_2) < N(\alpha_1)$ as we wanted. 
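In fact choosing $\beta_1$ as a closest Gaussian integer to the “Gaussian rational” $\alpha_0/\alpha_1$ 
will always lead to an $\alpha_2$ with $N(\alpha_2) < N(\alpha_1)$. This is because if we write the quotient 
$\alpha_2/\alpha_1$ in the form $x + yi$ for rational numbers $x$ and $y$ (so in the example above we 
have $x = 3/29$ and $y = -7/29$) then for $\beta_1$ to be a Gaussian integer closest to $\alpha_0/\alpha_1$ 
means that $|x| \le 1/2$ and $|y| \le 1/2$, and therefore: 

$$N\!\left(\frac{\alpha_2}{\alpha_1}\right) = x^2 + y^2 \le \frac14 + \frac14 < 1$$

$$\text{and hence}\quad N(\alpha_2) = N\!\left(\frac{\alpha_2}{\alpha_1}\cdot\alpha_1\right) = N\!\left(\frac{\alpha_2}{\alpha_1}\right) N(\alpha_1) < N(\alpha_1)$$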


Thus we have $N(\alpha_2) < N(\alpha_1)$. This shows that there is a general Euclidean algorithm 
in $\mathbb{Z}[i]$, and so $\mathbb{Z}[i]$ has unique factorization. 

Just as an exercise let us finish carrying out the Euclidean algorithm for $\alpha_0 = 12 + 15i$ 
and $\alpha_1 = 5 + 2i$. The next step is to divide $\alpha_2 = 1 - i$ into $\alpha_1 = 5 + 2i$: 

$$\frac{5 + 2i}{1 - i} = \frac{(5 + 2i)(1 + i)}{(1 - i)(1 + i)} = \frac{3 + 7i}{2} = (1 + 3i) + \frac{1 + i}{2}$$

Notice that the fractions $3/2$ and $7/2$ are exactly halfway between two consecutive 
integers, so instead of choosing $1 + 3i$ for the closest integer to $(3 + 7i)/2$ we could 
equally well have chosen $2 + 3i$ or $1 + 4i$ or $2 + 4i$. Let us stick with the choice $1 + 3i$ 
and use this to calculate the next $\alpha_i$: 

$$5 + 2i = (1 + 3i)(1 - i) + \alpha_3 = (4 + 2i) + \alpha_3$$

hence $\alpha_3 = 1$. The final step would be simply to write $1 - i = (1 - i)1 + 0$. Thus the 
full Euclidean algorithm gives the following equations: 

$$\begin{aligned}
12 + 15i &= (3 + 2i)(5 + 2i) + (1 - i) \\
5 + 2i &= (1 + 3i)(1 - i) + 1 \\
1 - i &= (1 - i)(1) + 0
\end{aligned}$$

In particular, since the last nonzero remainder is 1, a unit in Z[i], we deduce that this 
is the greatest common divisor of 12 + 15i and 5 + 2i, where “greatest” means “of 
greatest norm”. In other words 12 + 15i and 5 + 2i have no common divisors other 
than units. (This also follows from the fact that the norms N(12 + 15i) = 9-41 and 
N(5 + 2i) = 29 are coprime.) 


The equations that display the results of carrying out the Euclidean algorithm can 
be used to express the last nonzero remainder in terms of the original two numbers: 


$$\begin{aligned}
1 &= (5 + 2i) - (1 + 3i)(1 - i) \\
  &= (5 + 2i) - (1 + 3i)\bigl[(12 + 15i) - (3 + 2i)(5 + 2i)\bigr] \\
  &= -(1 + 3i)(12 + 15i) + (-2 + 11i)(5 + 2i)
\end{aligned}$$


If it had happened that the last nonzero remainder was a unit other than 1, we could 
have expressed this unit in terms of the original two Gaussian integers, and then 
multiplied the equation by the inverse of the unit to get an expression for 1 in terms 
of the original two Gaussian integers. 
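The whole computation, division steps and final linear combination alike, can be reproduced mechanically. The following is a minimal sketch under some stated assumptions: Gaussian integers are stored as pairs of Python integers, all helper names are hypothetical, and ties in the rounding are broken downward so that the quotient $1 + 3i$ chosen above is the one the code selects. Any other tie-breaking rule would also give a valid Euclidean algorithm, possibly with different intermediate values.

```python
# Gaussian integers are modeled as (x, y) pairs of Python ints meaning x + yi.
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1])

def mul(a, b):
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def norm(a):
    return a[0] ** 2 + a[1] ** 2

def nearest(num, den):
    """An integer nearest to num/den for den > 0, with ties rounded down."""
    return -((den - 2 * num) // (2 * den))

def divide(a0, a1):
    """Return (beta, a2) with a0 = beta*a1 + a2 and N(a2) < N(a1)."""
    n = norm(a1)
    t = mul(a0, (a1[0], -a1[1]))          # a0/a1 = t/n, using the conjugate of a1
    beta = (nearest(t[0], n), nearest(t[1], n))
    return beta, sub(a0, mul(beta, a1))

def extended_euclid(a0, a1):
    """Return (g, x, y) where g = a0*x + a1*y is the last nonzero remainder."""
    r_prev, r_cur = a0, a1
    x_prev, x_cur = (1, 0), (0, 0)
    y_prev, y_cur = (0, 0), (1, 0)
    while r_cur != (0, 0):
        beta, rem = divide(r_prev, r_cur)
        r_prev, r_cur = r_cur, rem
        x_prev, x_cur = x_cur, sub(x_prev, mul(beta, x_cur))
        y_prev, y_cur = y_cur, sub(y_prev, mul(beta, y_cur))
    return r_prev, x_prev, y_prev

g, x, y = extended_euclid((12, 15), (5, 2))
print(g, x, y)   # (1, 0) (-1, -3) (-2, 11)
```

The printed coefficients correspond to $1 = -(1 + 3i)(12 + 15i) + (-2 + 11i)(5 + 2i)$, the same expression obtained by hand.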


Having shown that prime factorizations in $\mathbb{Z}[i]$ are unique, let us see what this 
implies about the representation problem for the norm form $x^2 + y^2$. The equation 
$x^2 + y^2 = n$ can be written as $(x + yi)(x - yi) = n$. If the prime factorization of $x + yi$ 
in $\mathbb{Z}[i]$ is $x + yi = \alpha_1 \cdots \alpha_l$ and the prime factorization of $n$ in $\mathbb{Z}$ is $n = p_1 \cdots p_m$ 
then the equation $x^2 + y^2 = n$ becomes $\alpha_1\bar{\alpha}_1 \cdots \alpha_l\bar{\alpha}_l = p_1 \cdots p_m$. A prime $p$ 
in $\mathbb{Z}$ either splits as a product $\alpha\bar{\alpha}$ of two primes in $\mathbb{Z}[i]$ or remains prime in $\mathbb{Z}[i]$. 
Unique prime factorization means that, up to units, the factorization $n = \alpha_1\bar{\alpha}_1 \cdots \alpha_l\bar{\alpha}_l$ 
is obtained from the factorization $n = p_1 \cdots p_m$ by replacing each $p_i$ that splits by 
a product $\alpha_j\bar{\alpha}_j$. Each prime $p_i$ that does not split will be equal to some $\alpha_j$ or $\bar{\alpha}_j$, 
but in this case both factors $\alpha_j$ and $\bar{\alpha}_j$ are integers so they are equal. This means 
that the two factors $\alpha_j$ and $\bar{\alpha}_j$ give two factors of the product $p_1 \cdots p_m$ that are 
the same nonsplit prime. Thus nonsplit primes must occur to an even power in $n$. 
Conversely, if the nonsplit prime factors of $n$ occur only to even powers then we 
obtain a factorization $n = \alpha_1\bar{\alpha}_1 \cdots \alpha_l\bar{\alpha}_l$ and hence a solution of $x^2 + y^2 = n$ with 
$x + yi = \alpha_1 \cdots \alpha_l$. 

Thus we see that the equation $x^2 + y^2 = n$ has an integer solution $(x, y)$ exactly 
when each nonsplit prime factor $p$ of $n$ occurs with an even exponent in $n$. The split 
primes are the primes represented by $x^2 + y^2$, so these are 2 and primes $p = 4k + 1$ 
as we saw in Chapter 6. Hence the numbers expressible as the sum of two squares 
are the numbers in which each prime factor $p = 4k + 3$ occurs to an even power. This 
agrees with the answer we got in Chapter 6, but the only results from that chapter we 
have used here are the fact that all primes $p = 4k + 1$ are represented by $x^2 + y^2$ and 
the easy facts that 2 is represented but primes $p = 4k + 3$ are not represented. 
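As a sanity check, the criterion can be compared with a brute-force search over a range of values of $n$. The sketch below does this for $n < 2000$; the two function names and the range are arbitrary choices for the illustration.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Brute-force test: does x^2 + y^2 = n have an integer solution?"""
    for x in range(isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            return True
    return False

def criterion(n):
    """True when each prime factor p = 4k + 3 of n occurs to an even power."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # What remains is 1 or a single prime occurring to the first power.
    return n % 4 != 3

mismatches = [n for n in range(1, 2000)
              if is_sum_of_two_squares(n) != criterion(n)]
print(mismatches)   # an empty list means the two tests agree on this range
```

Trial division is used for the factorization only because it is short; it is adequate for numbers of this size.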


Going further, we can also answer the more subtle question of when the equation 
$x^2 + y^2 = n$ has a solution with $x$ and $y$ coprime. For $x$ and $y$ not to be coprime 
means they are both divisible by some prime $p$, which is the same as saying that 
$x + yi$ is divisible by $p$ in $\mathbb{Z}[i]$, or we could equally well say $x - yi$ instead of $x + yi$. 
If a prime factor $p$ of $n$ in $\mathbb{Z}$ does not split in $\mathbb{Z}[i]$ then $p$ will be prime in $\mathbb{Z}[i]$ so in 
the factorization $n = (x + yi)(x - yi)$ we will have $p$ as a prime factor of $x + yi$ or 
$x - yi$ in $\mathbb{Z}[i]$, so $x$ and $y$ will not be coprime. Thus $n$ must be a product of primes 
that split in $\mathbb{Z}[i]$. If one of these primes splits as $p = \alpha\bar{\alpha}$ then we cannot have both 
$\alpha$ and $\bar{\alpha}$ as two of the factors of $x + yi$, otherwise $p$ would divide $x + yi$. Thus if 
$p$ appears to the $k$th power in $n$, we must have $\alpha^k$ as a factor of $x + yi$ and $\bar{\alpha}^k$ as a 
factor of $x - yi$ or vice versa, at least when $\alpha$ and $\bar{\alpha}$ do not differ just by a unit. If $\alpha$ 
and $\bar{\alpha}$ differ just by a unit then we must have $k = 1$, otherwise $x + yi$ would have $p$ 
as a factor. We noted earlier that 2 is the only prime in $\mathbb{Z}$ that splits as a product of 
two primes in $\mathbb{Z}[i]$ that differ just by a unit, so the final result is that $x^2 + y^2 = n$ has 
a solution with $x$ and $y$ coprime exactly when $n = 2^a p_1 \cdots p_k$ with each $p_i$ a prime 
congruent to 1 mod 4 and $a \le 1$. This too agrees with what we showed in Chapter 6. 


An advantage of using Gaussian integers to determine the numbers represented 
by $x^2 + y^2$ is that this gives a way of computing explicitly all the representations of a 
given number $n$. Computing the topograph does this, but the amount of work needed 
increases rapidly as $n$ gets larger since one is computing the representations of all 
numbers smaller than $n$ at the same time. To illustrate how Gaussian integers speed 
things up for specific values of $n$ let us see how to find all the primitive solutions of 
$x^2 + y^2 = 5^k$. For $k = 1$ we have the solution $(x, y) = (2, 1)$ corresponding to the 
factorization $5 = (2 + i)(2 - i)$, so a primitive solution for arbitrary $k$ is obtained by 
expressing $(2 + i)^k$ as $x + yi$. One could use the binomial theorem for this, but this 
would involve computing binomial coefficients, so let us instead proceed by induction 
on $k$ using the formula $(x + yi)(2 + i) = (2x - y) + (x + 2y)i$. This yields the following 
sequence of pairs $(x, y)$ for $k = 1, 2, 3, 4, 5, 6, 7, 8$: 

$$(2, 1),\ (3, 4),\ (2, 11),\ (-7, 24),\ (-38, 41),\ (-117, 44),\ (-278, -29),\ (-527, -336)$$

The signs are irrelevant for solutions of $x^2 + y^2 = 5^k$ but they cannot be ignored when 
computing with the inductive formula. For each $k$ there are exactly eight primitive 
solutions, corresponding to $(2 + i)^k$ and $(2 - i)^k$ along with multiples of these by 
each of the four units $\pm 1, \pm i$. In terms of $x$ and $y$ these are the groups $(\pm x, \pm y)$ 
and $(\pm y, \pm x)$. In the topograph of $x^2 + y^2$ the value $5^k$ will appear just once in each 
quadrant since each pair of solutions $(x, y)$ and $(-x, -y)$ determines the same fraction 
$x/y$. This was guaranteed to happen by Proposition 6.16 which states that any 
two occurrences of the same prime power in a topograph are related by a symmetry 
of the topograph, for primes not dividing the conductor, and the conductor here is 1. 
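The inductive formula $(x + yi)(2 + i) = (2x - y) + (x + 2y)i$ translates directly into a loop. The sketch below regenerates the sequence of pairs and checks that each one is a primitive solution; the function name `powers_of_2_plus_i` is hypothetical.

```python
from math import gcd

def powers_of_2_plus_i(kmax):
    """Return the pairs (x, y) with x + yi = (2 + i)^k for k = 1, ..., kmax."""
    pairs = []
    x, y = 1, 0                        # (2 + i)^0 = 1
    for _ in range(kmax):
        x, y = 2 * x - y, x + 2 * y    # multiply x + yi by 2 + i
        pairs.append((x, y))
    return pairs

pairs = powers_of_2_plus_i(8)
print(pairs)
for k, (x, y) in enumerate(pairs, start=1):
    # each pair solves x^2 + y^2 = 5^k with x and y coprime
    assert x * x + y * y == 5 ** k and gcd(x, y) == 1
```

The other seven primitive solutions for each $k$ are obtained from the pair listed by changing signs and swapping the two coordinates.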


For negative discriminants it is not difficult to figure out exactly when $R_\Delta$ has a 
Euclidean algorithm. Recall that this means that for each pair of nonzero elements 
$\alpha_0$ and $\alpha_1$ in $R_\Delta$ there should exist elements $\beta$ and $\alpha_2$ such that $\alpha_0 = \beta\alpha_1 + \alpha_2$ 
and $N(\alpha_2) < N(\alpha_1)$. Since $\alpha_2$ is determined by $\alpha_0$, $\alpha_1$, and $\beta$, this is equivalent to 
saying that there should exist an element $\beta$ in $R_\Delta$ such that $N(\alpha_0 - \beta\alpha_1) < N(\alpha_1)$. 
The last inequality can be rewritten as $N(\alpha_0/\alpha_1 - \beta) < 1$. Geometrically this is saying 
that every point $\alpha_0/\alpha_1$ in the plane should be within a distance less than 1 of some 
point $\beta$ in the lattice $R_\Delta$. We can check this by seeing whether the interiors of all the 
circles of radius 1 centered at points of $R_\Delta$ completely cover the plane. 

For $\mathbb{Z}[\sqrt{D}]$ with $D < 0$ the critical case $D = -3$ is shown in 
the figure at the right, where the triangle is an equilateral triangle 
of side length 1. Here the four circles of radius 1 centered 
at $0$, $1$, $\sqrt{-3}$, and $1 + \sqrt{-3}$ intersect at the point $(1 + \sqrt{-3})/2$ 
so this point is not within distance less than 1 of an element of 
$\mathbb{Z}[\sqrt{-3}]$ and therefore the Euclidean algorithm fails in $\mathbb{Z}[\sqrt{-3}]$. 

For $D < -3$ the lattice $\mathbb{Z}[\sqrt{D}]$ is stretched vertically so the Euclidean 
algorithm fails in these cases too. For $D = -2$ the lattice is compressed 
vertically so $\mathbb{Z}[\sqrt{-2}]$ does have a Euclidean algorithm. 

In the case of $\mathbb{Z}[\omega]$ with $\omega = (1 + \sqrt{1 + 4d})/2$ and $d < 0$ 
the upper row of disks is at height $\sqrt{|1 + 4d|}/2$ above the 
lower row, so from the figure we see that the condition we 
need is that this height should be less than $1 + \sqrt{3}/2$. Thus we 
need $\sqrt{|1 + 4d|} < 2 + \sqrt{3}$. Squaring both sides gives $|1 + 4d| < 
7 + 4\sqrt{3}$ which is satisfied only in the cases $d = -1, -2, -3$. 
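The covering condition can also be checked numerically by computing, for each negative discriminant, the squared distance from the lattice to a point of the plane farthest from it and comparing it with 1. The formulas in the sketch below are elementary plane geometry worked out for this illustration rather than taken from the text: the center of a rectangle in the case of $\mathbb{Z}[\sqrt{D}]$, and the circumcenter of the triangle with vertices $0$, $1$, $\omega$ in the case of $\mathbb{Z}[\omega]$. The function name is hypothetical.

```python
from fractions import Fraction

def covering_radius_squared(delta):
    """Squared distance from the lattice for discriminant delta to a farthest point.

    Assumes delta < 0 with delta congruent to 0 or 1 mod 4.
    """
    a = -delta
    if delta % 4 == 0:
        # Z[sqrt(D)] with D = delta/4: rectangles of width 1 and height sqrt(|D|),
        # farthest point at the center, squared distance 1/4 + |D|/4.
        return Fraction(1, 4) + Fraction(a, 16)
    # Z[omega]: isosceles triangles with base 1 and height sqrt(|delta|)/2,
    # farthest point at the circumcenter, squared circumradius (a + 1)^2 / (16a).
    return Fraction((a + 1) ** 2, 16 * a)

euclidean = [d for d in range(-1, -60, -1)
             if d % 4 in (0, 1) and covering_radius_squared(d) < 1]
print(euclidean)   # [-3, -4, -7, -8, -11]
```

For $\Delta = -12$, the lattice $\mathbb{Z}[\sqrt{-3}]$, the value comes out exactly 1, which is the critical case where the circles meet at a point without covering it.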


In summary, we have shown the following result: 


Proposition 8.9. The only negative discriminants $\Delta$ for which $R_\Delta$ has a Euclidean 
algorithm are $\Delta = -3, -4, -7, -8, -11$. 

Notice that these are the first five negative discriminants. 


For even discriminants $\Delta = 4D$ it is easy to prove that unique factorization fails 
in $R_\Delta = \mathbb{Z}[\sqrt{D}]$ in all cases when $\Delta$ is negative and there is no Euclidean algorithm: 


Proposition 8.10. Unique factorization fails in $\mathbb{Z}[\sqrt{D}]$ whenever $D < -2$, and it 
also fails when $D > 0$ and $D \equiv 1$ modulo 4. 


Proof: The number $D^2 - D$ factors in $\mathbb{Z}[\sqrt{D}]$ as $(D + \sqrt{D})(D - \sqrt{D})$, and it also 
factors as $D(D - 1)$. The number 2 divides either $D$ or $D - 1$ since one of these 
two consecutive integers must be even. However, 2 does not divide either $D + \sqrt{D}$ or 
$D - \sqrt{D}$ in $\mathbb{Z}[\sqrt{D}]$ since $(D \pm \sqrt{D})/2$ is not an element of $\mathbb{Z}[\sqrt{D}]$ as the coefficient of 
$\sqrt{D}$ in this quotient is not an integer. If we knew that 2 was prime in $\mathbb{Z}[\sqrt{D}]$ we would 
then have two distinct factorizations of $D^2 - D$ into primes in $\mathbb{Z}[\sqrt{D}]$: one obtained 
by combining prime factorizations of $D$ and $D - 1$ in $\mathbb{Z}[\sqrt{D}]$ and the other obtained 
by combining prime factorizations of $D + \sqrt{D}$ and $D - \sqrt{D}$. The first factorization 
would contain the prime 2 and the second would not. 

It remains to check that 2 is a prime in $\mathbb{Z}[\sqrt{D}]$ in the cases listed. If it is not 
a prime, then it factors as $2 = \alpha\beta$ with neither $\alpha$ nor $\beta$ a unit, so we would have 
$N(\alpha) = N(\beta) = \pm 2$. Thus the equation $x^2 - Dy^2 = \pm 2$ would have an integer 
solution $(x, y)$. This is clearly impossible if $D = -3$ or any negative integer less than 
$-3$. If $D > 0$ and $D \equiv 1$ modulo 4 then if we look at the equation $x^2 - Dy^2 = \pm 2$ 
modulo 4 it becomes $x^2 - y^2 \equiv 2$ mod 4, but this is impossible since $x^2$ and $y^2$ are 
congruent to 0 or 1 modulo 4, so $x^2 - y^2$ is congruent to 0, 1, or $-1$. □ 
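Both halves of the last step are finite checks, so they can be confirmed by enumeration. The sketch below is only an illustration with hypothetical function names: the first function lists the residues of $x^2 - Dy^2$ modulo 4, and the second searches for solutions of $x^2 - Dy^2 = 2$ when $D < 0$, where only $|x| \le 1$ and $|y| \le 1$ are possible and the value $-2$ cannot occur because the norm is positive.

```python
def values_mod_4(D):
    """All residues of x^2 - D*y^2 modulo 4."""
    return {(x * x - D * y * y) % 4 for x in range(4) for y in range(4)}

def norm_two_solutions_negative(D):
    """Integer solutions of x^2 - D*y^2 = 2 for D < 0 (then |x| <= 1 and |y| <= 1)."""
    return [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)
            if x * x - D * y * y == 2]

for D in (5, 13, 17, 21):
    print(D, sorted(values_mod_4(D)))          # the residue 2 never appears

for D in (-1, -2, -3, -5, -6):
    print(D, norm_two_solutions_negative(D))   # solutions exist only for D = -1, -2
```

The solutions found for $D = -1$ and $D = -2$ correspond to the factorizations $2 = (1 + i)(1 - i)$ and $2 = -(\sqrt{-2})^2$, which is why these two values are excluded from the proposition.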


This proposition says in particular that unique factorization fails in $\mathbb{Z}[\sqrt{-3}]$, 
$\mathbb{Z}[\sqrt{-7}]$, and $\mathbb{Z}[\sqrt{-11}]$. However, when we enlarge these rings to $\mathbb{Z}[\omega]$ for $\omega$ equal 
to $(1 + \sqrt{-3})/2$, $(1 + \sqrt{-7})/2$, and $(1 + \sqrt{-11})/2$ we do have unique factorization. 
A similar thing happens when we enlarge $\mathbb{Z}[\sqrt{-8}]$ to $\mathbb{Z}[\sqrt{-2}]$. In all these cases the 
enlargement replaces a nonfundamental discriminant by one which is fundamental. 

One might wonder whether there are other ways to enlarge $\mathbb{Z}[\sqrt{D}]$ to make prime 
factorization unique when it is not unique in $\mathbb{Z}[\sqrt{D}]$ itself. Without changing things 
too drastically, suppose we just tried a different choice of $\omega$. In order to do multiplication 
within the set $\mathbb{Z}[\omega]$ of numbers $x + y\omega$ with $x$ and $y$ integers one must 
be able to express $\omega^2$ as $m\omega + n$, which means that $\omega$ must satisfy a quadratic 
equation $\omega^2 - m\omega - n = 0$. This has roots $(m \pm \sqrt{m^2 + 4n})/2$, so we see that 
larger denominators than 2 in the definition of $\omega$ will not work. If $m$ is even, say 
$m = 2k$, then $\omega$ becomes $k \pm \sqrt{k^2 + n}$, with no denominators at all and we are 
back in the situation of a ring $\mathbb{Z}[\sqrt{D}]$. If $m$ is odd, say $m = 2k + 1$, then $\omega$ becomes 
$(2k + 1 \pm \sqrt{4k^2 + 4k + 1 + 4n})/2$ which equals $k + (1 \pm \sqrt{1 + 4d})/2$ for $d = k^2 + k + n$ so 
the ring $\mathbb{Z}[\omega]$ in this case would be the same as when we chose $\omega = (1 + \sqrt{1 + 4d})/2$. 

It is known that there are only nine negative discriminants for which $R_\Delta$ has 
unique factorization, the discriminants 

$$\Delta = -3, -4, -7, -8, -11, -19, -43, -67, -163$$


These are exactly the nine negative discriminants for which all quadratic forms of 
that discriminant are equivalent. This is not an accident since the usual way one 
determines whether unique factorization holds is by proving that unique factorization 
holds precisely when all forms of the given discriminant are equivalent, as we will see 
later in the chapter. This is for negative discriminants. For positive discriminants the 
condition is that all forms are equivalent to either the principal form or its negative. 


For positive discriminants the norm form is hyperbolic so it takes on both positive 
and negative values. The Euclidean algorithm is then modified so that in the 
equations $\alpha_{i-1} = \beta_i\alpha_i + \alpha_{i+1}$ it is required that $|N(\alpha_{i+1})| < |N(\alpha_i)|$. It is known that 
there are exactly 16 positive fundamental discriminants for which there is a Euclidean 
algorithm in $R_\Delta$: 

$$\Delta = 5, 8, 12, 13, 17, 21, 24, 28, 29, 33, 37, 41, 44, 57, 73, 76$$

The determination of this list is quite a bit more difficult than for negative discrimi- 
nants since the norm no longer has the nice geometric meaning of the square of the 
distance to the origin in the plane. 

There are many positive fundamental discriminants for which $R_\Delta$ has unique 
factorization even though there is no Euclidean algorithm. The fundamental discriminants 
less than 100 with this property are 53, 56, 61, 69, 77, 88, 89, 92, 93, 97. 


To conclude this section we give two applications of unique factorization to
quadratic forms. The first will be to find all primitive solutions of $x^2 + 7y^2 = 2^k$. This
equation came up in Section 6.2 when we were considering which powers of a prime
that divides the conductor for a given nonfundamental discriminant are represented
by primitive forms of that discriminant. For the form $x^2 + 7y^2$ the discriminant is
$-28$ with class number 1 and conductor 2 so the question was which powers of 2 are
represented by $x^2 + 7y^2$. Obviously 2 and $2^2$ are not represented, but we showed
that all powers $2^k$ with $k \ge 3$ are represented. However the method there did not
produce actual primitive solutions of $x^2 + 7y^2 = 2^k$ so that is what we will find here.

The form $x^2 + 7y^2$ is the norm form in $\mathbb{Z}[\sqrt{-7}]$ so we are looking for elements
$x + y\sqrt{-7}$ of $\mathbb{Z}[\sqrt{-7}]$ of norm $x^2 + 7y^2 = 2^k$ with $x$ and $y$ coprime. The ring $\mathbb{Z}[\sqrt{-7}]$
does not have unique factorization, so we will enlarge it to $\mathbb{Z}[\omega]$ for $\omega = (1 + \sqrt{-7})/2$
since $\mathbb{Z}[\omega]$ does have unique factorization. The only units in $\mathbb{Z}[\omega]$ are $\pm 1$ so prime
factorizations are unique up to signs.

We have $N(\omega) = \omega\overline{\omega} = 2$ so $N(\omega^k) = 2^k$. The prime factorization of $2^k$ in
$\mathbb{Z}[\omega]$ is $2^k = \omega^k\overline{\omega}^k$ so the elements of $\mathbb{Z}[\omega]$ of norm $2^k$ are, up to sign, the products
$\omega^l\overline{\omega}^m$ with $l + m = k$. We need to determine which of these products lie in $\mathbb{Z}[\sqrt{-7}]$
and are primitive, that is, not an integer multiple of another element of $\mathbb{Z}[\sqrt{-7}]$ unless
that integer is $\pm 1$.

Consider first the case $m = 0$. If $\omega^k$ is an element $a + b\sqrt{-7}$ of $\mathbb{Z}[\sqrt{-7}]$ then
the norm equation $a^2 + 7b^2 = 2^k$ implies that $a$ and $b$ have the same parity. If they
are both even then $\omega^k$ would be divisible by 2 in $\mathbb{Z}[\sqrt{-7}]$ and hence also divisible
by 2 in $\mathbb{Z}[\omega]$, but this is impossible since 2 factors as $\omega\overline{\omega}$ and $\overline{\omega}$ is not one of the
prime factors of $\omega^k$ since $\overline{\omega} \ne \pm\omega$. If $a$ and $b$ are both odd then $\omega^k$ is 2 times an
element of $\mathbb{Z}[\omega]$ and we have the same contradiction. Thus we must have $m > 0$,
and similarly we must have $l > 0$.

If $m = 1$ then we are considering the product $\omega^{k-1}\overline{\omega}$ which equals $2\omega^{k-2}$. This
is twice an element of $\mathbb{Z}[\omega]$ so it lies in $\mathbb{Z}[\sqrt{-7}]$ and can be written as $x + y\sqrt{-7}$ for
some integers $x$ and $y$. If $x$ and $y$ are not coprime, they are divisible by some prime
$p$ which must be 2 since odd primes do not divide $2\omega^{k-2}$ in $\mathbb{Z}[\omega]$, as $N(2\omega^{k-2}) = 2^k$.
This leaves the possibility that $x$ and $y$ are both even. If this is the case then we can
cancel a 2 from both sides of the equation $2\omega^{k-2} = x + y\sqrt{-7}$ to get $\omega^{k-2}$ as an
element of $\mathbb{Z}[\sqrt{-7}]$, which is impossible if $k \ge 3$ as we saw in the preceding paragraph.
Thus we conclude that $x + y\sqrt{-7} = 2\omega^{k-2}$ gives a primitive solution of $x^2 + 7y^2 = 2^k$
when $k \ge 3$. Similarly, if $l = 1$ we would obtain the conjugate solution $x - y\sqrt{-7}$,
just changing the sign of $y$.

There remains the possibility that both $l$ and $m$ are greater than 1. In these
cases $\omega^l\overline{\omega}^m$ would be divisible by 4, giving an element $x + y\sqrt{-7}$ of $\mathbb{Z}[\sqrt{-7}]$ with
$x$ and $y$ even, so we would not get a primitive solution of $x^2 + 7y^2 = 2^k$.

Thus we have shown that there are exactly four primitive solutions of $x^2 + 7y^2 = 2^k$
for each $k \ge 3$, differing only in the signs of $x$ and $y$ so there is a unique primitive
solution with $x$ and $y$ positive. We can compute this solution by computing $2\omega^{k-2}$
as an element $x + y\sqrt{-7}$. This can be done inductively using the formula:


$(a + b\sqrt{-7}) \cdot \dfrac{1 + \sqrt{-7}}{2} = \dfrac{(a - 7b) + (a + b)\sqrt{-7}}{2}$


Here is a table of these values for $k \le 15$:


k        3       4        5         6        7         8       9         10
(x,y)  (1,1)  (-3,1)  (-5,-1)  (1,-3)  (11,-1)  (9,5)  (-13,7)  (-31,-3)


k        11          12         13        14         15
(x,y)  (-5,-17)  (57,-11)  (67,23)  (-47,45)  (-181,-1)


Omitting the minus signs gives the positive primitive solutions. However, omitting
the minus signs at each step in order to simplify the calculations does not
work. For example, if we use the solution $(3, 1)$ for $k = 4$ instead of $(-3, 1)$ in
the formula $((a - 7b) + (a + b)\sqrt{-7})/2$, this gives the nonprimitive solution
$(-2, 2)$ for $k = 5$ instead of $(-5, -1)$.
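
The inductive computation can be sketched in a few lines. This is only an illustration, assuming we start from $2\omega = 1 + \sqrt{-7}$ for $k = 3$ and apply the multiplication formula above at each step; the function name `primitive_solutions` is ours.

```python
from math import gcd


def primitive_solutions(k_max):
    """Return {k: (a, b)} with a + b*sqrt(-7) = 2*omega^(k-2) for 3 <= k <= k_max."""
    sols = {}
    a, b = 1, 1                      # 2*omega = 1 + sqrt(-7), the case k = 3
    for k in range(3, k_max + 1):
        sols[k] = (a, b)
        # multiply by omega = (1 + sqrt(-7))/2; a and b are both odd,
        # so both numerators are even and the divisions are exact
        a, b = (a - 7 * b) // 2, (a + b) // 2
    return sols


sols = primitive_solutions(15)
for k, (a, b) in sols.items():
    # each pair is a primitive solution of x^2 + 7y^2 = 2^k
    assert a * a + 7 * b * b == 2 ** k and gcd(a, b) == 1
    print(k, (a, b))
```

The signs must be carried along exactly as computed, which is why the sketch stores the signed pairs and leaves taking absolute values to the end.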
This problem has some history. In the early 1900s the number theorist Ramanujan
observed that the Diophantine equation $x^2 + 7 = 2^k$ has solutions for
$k = 3, 4, 5, 7, 15$ and he conjectured that there were no solutions for larger $k$. In
terms of the preceding example this is saying that the only solutions of $x^2 + 7y^2 = 2^k$
with $y = 1$ occur in these five cases, so $x = 1, 3, 5, 11, 181$ as in the table above.
(Note that a solution with $y = 1$ must be primitive.) Ramanujan's conjecture was
later proved in a paper by Skolem, Chowla, and Lewis published in 1959.


For the other application of unique factorization we consider the forms $x^2 + 18y^2$
and $2x^2 + 9y^2$ of discriminant $-72$. The class number here is 2 and these forms are
in the two classes. The discriminant $-72$ is not fundamental since $-72 = 3^2(-8)$
with $-8$ a fundamental discriminant, so the conductor is 3. This leads us to ask
which powers of 3 are represented by the two forms. Neither form represents 3 and
only the second form represents 9, but both forms represent 27, coincidentally when
$(x, y) = (3, 1)$ in both cases.

As in the preceding example we will enlarge the ring $\mathbb{Z}[\sqrt{-18}]$, which is $R_\Delta$ for
$\Delta = -72$, to the corresponding ring $\mathbb{Z}[\sqrt{-2}]$ which is $R_\Delta$ for $\Delta = -8$, in order to
take advantage of the fact that $\mathbb{Z}[\sqrt{-2}]$ has unique factorization while $\mathbb{Z}[\sqrt{-18}]$ does
not. Note that $\sqrt{-18} = 3\sqrt{-2}$ so $\mathbb{Z}[\sqrt{-18}]$ is contained in $\mathbb{Z}[\sqrt{-2}]$ as the numbers
$a + 3b\sqrt{-2}$.

First we consider the form $x^2 + 18y^2 = N(x + 3y\sqrt{-2})$ so we are looking for
elements $a + 3b\sqrt{-2}$ of $\mathbb{Z}[\sqrt{-18}]$ of norm $3^k$ with $a$ and $b$ coprime. An element of
$\mathbb{Z}[\sqrt{-2}]$ of norm 3 is $1 + \sqrt{-2}$, so $(1 + \sqrt{-2})^k$ has norm $3^k$. However $(1 + \sqrt{-2})^k$
does not lie in $\mathbb{Z}[\sqrt{-18}]$, for suppose $(1 + \sqrt{-2})^k = a + 3b\sqrt{-2}$ for some integers $a$
and $b$. Taking norms, we would then have $3^k = a^2 + 18b^2$. This implies 3 divides
$a$, hence 3 divides $(1 + \sqrt{-2})^k = a + 3b\sqrt{-2}$ in $\mathbb{Z}[\sqrt{-2}]$, but this is impossible since
the prime factorization of 3 in $\mathbb{Z}[\sqrt{-2}]$ is $(1 + \sqrt{-2})(1 - \sqrt{-2})$ and $1 - \sqrt{-2}$ is not a
prime factor of $(1 + \sqrt{-2})^k$.

To get an element of $\mathbb{Z}[\sqrt{-18}]$ of norm $3^k$ we now try $3(1 + \sqrt{-2})^{k-2}$ which has
this norm and lies in $\mathbb{Z}[\sqrt{-18}]$ since it is 3 times an element of $\mathbb{Z}[\sqrt{-2}]$. Thus we
can write $3(1 + \sqrt{-2})^{k-2} = a + b\sqrt{-18}$ for some integers $a$ and $b$. To check whether
$a$ and $b$ are coprime we note first that by taking norms we see that the only prime
that could divide $a$ and $b$ is 3. If 3 does divide $a$ and $b$ we can divide the equation
$3(1 + \sqrt{-2})^{k-2} = a + b\sqrt{-18}$ by 3 and deduce that $(1 + \sqrt{-2})^{k-2}$ is an element of
$\mathbb{Z}[\sqrt{-18}]$, but we saw in the preceding paragraph that this is not the case if $k \ge 3$.
Thus we have a solution of $x^2 + 18y^2 = 3^k$ with coprime integers $x$ and $y$ for each
$k \ge 3$.

Now we turn to the form $2x^2 + 9y^2$. The starting point here is the observation
that if we restrict the form $x^2 + 18y^2$ to pairs $(x, y)$ with $x$ even, then we have
$(2x)^2 + 18y^2$ which is just $2(2x^2 + 9y^2)$, or twice the form $2x^2 + 9y^2$. Thus we are
looking for elements $2x + y\sqrt{-18}$ of $\mathbb{Z}[\sqrt{-18}]$ of norm $2 \cdot 3^k$ with $x$ and $y$ coprime.
A reasonable guess might be $\sqrt{-2} \cdot 3(1 + \sqrt{-2})^{k-2}$ which has norm $2 \cdot 3^k$. This lies in
$\mathbb{Z}[\sqrt{-18}]$ since it is 3 times an element of $\mathbb{Z}[\sqrt{-2}]$ so we can write it as $a + b\sqrt{-18}$.
A prime dividing $a$ and $b$ must divide the norm $2 \cdot 3^k$ so it must be 2 or 3. If 2
divided $a$ and $b$ then 4 would divide the norm so this is impossible. If 3 divided $a$
and $b$ then after canceling this 3 we would have $\sqrt{-2}(1 + \sqrt{-2})^{k-2}$ being an element
of $\mathbb{Z}[\sqrt{-18}]$, but this is impossible by the same argument that showed $(1 + \sqrt{-2})^k$
was not in $\mathbb{Z}[\sqrt{-18}]$. Thus $a$ and $b$ are coprime. It remains only to check that $a$ is
even, but this is immediate from the norm equation $a^2 + 18b^2 = 2 \cdot 3^k$.

These arguments show that all the powers $3^k$ with $k \ge 3$ are represented by both
$x^2 + 18y^2$ and $2x^2 + 9y^2$. This sort of behavior, with nonequivalent forms of the
same discriminant representing the same prime powers, can only happen for nonfundamental
discriminants, and then only for powers of primes dividing the conductor,
as we know from Chapter 6.
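
The two constructions are easy to carry out numerically. The following sketch assumes we store $(1 + \sqrt{-2})^{k-2}$ as an integer pair $(u, v)$ meaning $u + v\sqrt{-2}$; the function names are ours. Then $3(u + v\sqrt{-2}) = 3u + v\sqrt{-18}$ gives $(x, y) = (3u, v)$ for the first form, and $\sqrt{-2} \cdot 3(u + v\sqrt{-2}) = -6v + u\sqrt{-18}$ gives $(x, y) = (-3v, u)$ for the second.

```python
from math import gcd


def power_of_generator(j):
    """Return (u, v) with u + v*sqrt(-2) = (1 + sqrt(-2))**j."""
    u, v = 1, 0
    for _ in range(j):
        # (u + v*sqrt(-2)) * (1 + sqrt(-2)) = (u - 2v) + (u + v)*sqrt(-2)
        u, v = u - 2 * v, u + v
    return u, v


def solutions_for_power(k):
    """Primitive solutions of x^2 + 18y^2 = 3^k and of 2x^2 + 9y^2 = 3^k, k >= 3."""
    u, v = power_of_generator(k - 2)
    first = (3 * u, v)       # from 3(1 + sqrt(-2))^(k-2) = x + y*sqrt(-18)
    second = (-3 * v, u)     # from sqrt(-2) * 3(1 + sqrt(-2))^(k-2) = 2x + y*sqrt(-18)
    return first, second


for k in range(3, 9):
    (x1, y1), (x2, y2) = solutions_for_power(k)
    assert x1 * x1 + 18 * y1 * y1 == 3 ** k and gcd(x1, y1) == 1
    assert 2 * x2 * x2 + 9 * y2 * y2 == 3 ** k and gcd(x2, y2) == 1
    print(k, (x1, y1), (x2, y2))
```

As with the previous example, the signs come out however the multiplication dictates, and taking absolute values at the end gives positive solutions.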

The trick of realizing $2x^2 + 9y^2$ as a multiple of the form obtained by restricting
the norm form $x^2 + 18y^2$ to certain values of $x$ and $y$ in $\mathbb{Z}[\sqrt{-18}]$ is in fact part of
a general pattern that will be explored in the next section.


Exercises 


1. (a) According to Proposition 8.10, unique factorization fails in $\mathbb{Z}[\sqrt{D}]$ when $D = -3$
since the number $D(D - 1) = 12$ has two distinct prime factorizations in $\mathbb{Z}[\sqrt{D}]$. On
the other hand, if we enlarge $\mathbb{Z}[\sqrt{-3}]$ to $\mathbb{Z}[\omega]$ for $\omega = (1 + \sqrt{-3})/2$ then unique
factorization is restored. Explain how the two prime factorizations of 12 in $\mathbb{Z}[\sqrt{-3}]$
give rise to the same prime factorization in $\mathbb{Z}[\omega]$ (up to units).


(b) Do the same thing for the case $D = -7$.


2. Show that the number 8 has two different prime factorizations in $\mathbb{Z}[\sqrt{-7}]$, one
with three prime factors and the other with two prime factors.


3. In $R_\Delta$ for $\Delta = -3$ show that the only primes $\alpha$ for which $\overline{\alpha}$ is a unit times $\alpha$ are
$\sqrt{-3}$ and units times $\sqrt{-3}$.


4. In this problem we consider $\mathbb{Z}[\sqrt{-2}]$, so elements of $\mathbb{Z}[\sqrt{-2}]$ are sums $x + y\sqrt{-2}$
for integers $x$ and $y$, with $N(x + y\sqrt{-2}) = (x + y\sqrt{-2})(x - y\sqrt{-2}) = x^2 + 2y^2$.


(a) Draw the topograph of $x^2 + 2y^2$ including all values less than 70 (by symmetry, it
suffices to draw just the upper half of the topograph). Circle the values that are prime
(prime in $\mathbb{Z}$, that is). Also label each region with its $x/y$ fraction.


(b) Which primes in $\mathbb{Z}$ factor in $\mathbb{Z}[\sqrt{-2}]$?


(c) Using the information in part (a), list all primes in $\mathbb{Z}[\sqrt{-2}]$ of norm less than 70.


(d) Draw a diagram in the $xy$-plane showing all elements $x + y\sqrt{-2}$ in $\mathbb{Z}[\sqrt{-2}]$ of
norm less than 70 as small dots, with larger dots or squares for the elements that are
prime in $\mathbb{Z}[\sqrt{-2}]$. (There is symmetry, so the primes in the first quadrant determine
the primes in the other quadrants.)


(e) Show that the only primes $x + y\sqrt{-2}$ in $\mathbb{Z}[\sqrt{-2}]$ with $x$ even are $\pm\sqrt{-2}$. (Your
diagram in part (d) should give some evidence that this is true.)


(f) Factor $4 + \sqrt{-2}$ into primes in $\mathbb{Z}[\sqrt{-2}]$.


(g) Use the unique factorization property in $\mathbb{Z}[\sqrt{-2}]$ to determine which numbers are
represented by the form $x^2 + 2y^2$, as was done in the text for $x^2 + y^2$.


5. Following the two examples at the end of this section, find primitive solutions of
$x^2 + 18y^2 = 3^k$ and of $2x^2 + 9y^2 = 3^k$ for $k = 3, 4, 5, 6, 7, 8$.


8.3 The Correspondence Between Forms and Ideals 


So far in this chapter we have focused on principal forms, and now we begin
to extend what we have done to arbitrary forms. For principal forms we began by
factoring them as a product of two linear factors whose coefficients involved square
roots, for example the factorization $x^2 - Dy^2 = (x + \sqrt{D}\,y)(x - \sqrt{D}\,y)$ in the case of
discriminant $\Delta = 4D$. For a general form $Q(x, y) = ax^2 + bxy + cy^2$ of discriminant
$\Delta$ the corresponding factorization is $a(x - \alpha y)(x - \overline{\alpha} y)$ where $\alpha$ is a root of the
quadratic equation $ax^2 + bx + c = 0$. Thus we have:


$ax^2 + bxy + cy^2 = a\Bigl(x + \dfrac{b + \sqrt{\Delta}}{2a}\,y\Bigr)\Bigl(x + \dfrac{b - \sqrt{\Delta}}{2a}\,y\Bigr)$


An equivalent equation that will be more convenient for our purposes is obtained by
multiplying both sides of the preceding equation by the coefficient $a$:


$a(ax^2 + bxy + cy^2) = \Bigl(ax + \dfrac{b + \sqrt{\Delta}}{2}\,y\Bigr)\Bigl(ax + \dfrac{b - \sqrt{\Delta}}{2}\,y\Bigr)$


The advantage of writing the equation this way is that in each of the two linear factors
on the right the coefficients of $x$ and $y$ now lie in the ring $R_\Delta$ since $b$ must have
the same parity as $\Delta$. Thus if $\Delta = 4D$ we can eliminate the denominator 2 in the
coefficient of $y$ to obtain an element of $\mathbb{Z}[\sqrt{D}]$ while if $\Delta = 4d + 1$ the fraction lies in
$\mathbb{Z}[\omega]$ since $b$ is odd. Another thing to observe is that the right side of the equation is
just the norm $N\bigl(ax + \frac{b + \sqrt{\Delta}}{2}\,y\bigr)$, so the displayed equation above can be written more
concisely as $aQ(x, y) = N\bigl(ax + \frac{b + \sqrt{\Delta}}{2}\,y\bigr)$.


For a form $Q(x, y) = ax^2 + bxy + cy^2$ the set of numbers $ax + \frac{b + \sqrt{\Delta}}{2}\,y$ as $x$ and
$y$ range over all integers forms a lattice contained in the larger lattice $R_\Delta$ in the plane.
Here by a lattice we mean a set of numbers of the form $\alpha x + \beta y$ for fixed nonzero
elements $\alpha$ and $\beta$ of $R_\Delta$, with $x$ and $y$ varying over $\mathbb{Z}$, and we assume that $\alpha$ and $\beta$
do not lie on the same line through the origin. We denote this lattice by $L(\alpha, \beta)$ and
call $\alpha$ and $\beta$ a basis for the lattice.
[Figure: a lattice $L(\alpha, \beta)$ with some of its points labeled as integer combinations of $\alpha$ and $\beta$.]


In particular, associated to the form $Q$ we have the lattice $L_Q = L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$
consisting of all the numbers $ax + \frac{b + \sqrt{\Delta}}{2}\,y$ for integers $x$ and $y$. The earlier equation
$aQ(x, y) = N\bigl(ax + \frac{b + \sqrt{\Delta}}{2}\,y\bigr)$ then says that the form $Q$ is obtained from the lattice
$L_Q$ by taking the norms of all its elements and multiplying by the constant factor $1/a$,
which can be regarded as a sort of normalization constant as we will see in more detail
later.
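
The relation between the form and the norms of lattice elements can be checked with exact rational arithmetic. In this sketch an element $p + q\sqrt{\Delta}$ is stored as a pair of fractions $(p, q)$, so its norm is $p^2 - \Delta q^2$; the helper names are ours and are only meant as an illustration.

```python
from fractions import Fraction


def norm(p, q, disc):
    """Norm of p + q*sqrt(disc), namely (p + q*sqrt(disc)) * (p - q*sqrt(disc))."""
    return p * p - disc * q * q


def lattice_element(a, b, x, y):
    """Coordinates (p, q) of a*x + ((b + sqrt(disc))/2)*y as p + q*sqrt(disc)."""
    return a * x + Fraction(b * y, 2), Fraction(y, 2)


def check_norm_identity(a, b, c, x, y):
    """True when a*Q(x, y) equals the norm of the corresponding lattice element."""
    disc = b * b - 4 * a * c
    p, q = lattice_element(a, b, x, y)
    return norm(p, q, disc) == a * (a * x * x + b * x * y + c * y * y)


# Try a few forms, including one of positive discriminant, on a small grid.
forms = [(1, 0, 1), (2, 2, 3), (3, 2, 5), (1, 1, -1)]
all_ok = all(check_norm_identity(a, b, c, x, y)
             for (a, b, c) in forms
             for x in range(-4, 5) for y in range(-4, 5))
print(all_ok)
```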

Let us look at some examples to see what $L_Q$ can look like in the case $\Delta = -4$ so
$R_\Delta = \mathbb{Z}[i]$, the Gaussian integers. In this case we have $ax + \frac{b + \sqrt{\Delta}}{2}\,y = ax + (b' + i)y$
where $b' = b/2$ is an integer since $b$ always has the same parity as $\Delta$. For the principal
form $x^2 + y^2$ we have $a = 1$ and $b' = 0$ so $L_Q = L(1, i) = \mathbb{Z}[i]$. Four more cases are
shown in the figures below.


[Figure panels, including: $2x^2 + 2xy + y^2 \longleftrightarrow L(2, 1 + i)$ and $5x^2 + 4xy + y^2 \longleftrightarrow L(5, 2 + i)$]
In each case the lattice forms a grid of squares, rotated and expanded from the square
grid formed by $\mathbb{Z}[i]$ itself. Not all lattices in $\mathbb{Z}[i]$ form square grids since for example
one could have a lattice of long thin rectangles such as $L(10, i)$.

A 90 degree rotation of the plane about the origin takes a square lattice to itself.
Conversely, a lattice $L$ that is taken to itself by a 90 degree rotation about the origin
must be a square lattice. To see this, observe first that the 90 degree rotation takes a
nonzero point $\alpha$ of $L$ that is closest to the origin to another point $\beta$ of $L$ closest to the origin,
with the sum $\alpha + \beta$ giving the fourth vertex of a square of lattice points. The lattice
$L(\alpha, \beta)$ is then a square lattice contained in $L$. In fact we must have $L = L(\alpha, \beta)$, for if
there were a point of $L$ in the interior of a square of $L(\alpha, \beta)$ then such a point would
be closer to a corner of the square than the length of the side of the square, which
is impossible since the minimum distance between any two points in a lattice equals
the minimum distance from the origin to a lattice point.

Since 90 degree rotation is the same as multiplication of complex numbers by $i$,
we could also say that square lattices are those that are taken to themselves by
multiplication by $i$. Once a lattice has this property, it follows that multiplication by an
arbitrary element of $\mathbb{Z}[i]$ takes the lattice into itself. Namely, if we know that $i\alpha$ is
in a lattice $L$ whenever $\alpha$ is in $L$, then for arbitrary integers $m$ and $n$ it follows that
$m\alpha$ and $ni\alpha$ are in $L$ and hence $(m + ni)\alpha$ is in $L$.


There is a standard term for this concept. A lattice $L$ in $R_\Delta$ is called an ideal if
for each element $\alpha$ in $L$ and each $\beta$ in $R_\Delta$ the product $\beta\alpha$ is in $L$. In other words,
$L$ is taken to itself by multiplication by every element of $R_\Delta$. The term “ideal” may
seem like an odd name, but it originally arose in a slightly different context where it
seems more natural, as we will see later in the chapter. For now we can just imagine
that ideals are the best kind of lattices, “ideal lattices”.

The fact that all lattices $L_Q$ in $\mathbb{Z}[i]$ are square lattices is a special case of the
following general fact:


Proposition 8.11. For each quadratic form $Q = ax^2 + bxy + cy^2$ of discriminant
$\Delta$ the lattice $L_Q = L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ is an ideal in $R_\Delta$.


Proof: To cover all discriminants at once we can write $R_\Delta$ as $\mathbb{Z}[\tau]$ for $\tau = \frac{e + \sqrt{\Delta}}{2}$ where
$e$ is 0 if $\Delta = 4D$ and 1 if $\Delta = 4d + 1$. What we need to check in order to verify that
the lattice $L_Q = L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ is an ideal is that both of the products $\tau \cdot a$ and $\tau \cdot \frac{b + \sqrt{\Delta}}{2}$
are elements of $L_Q$. For the product $\tau \cdot a$ this means we want to solve the equation
$\frac{e + \sqrt{\Delta}}{2} \cdot a = ax + \frac{b + \sqrt{\Delta}}{2}\,y$ for integers $x$ and $y$. Comparing the coefficients of $\sqrt{\Delta}$ on
both sides of the equation, we get $y = a$, an integer. Substituting $y = a$ into the
equation then gives $\frac{ea}{2} = ax + \frac{ba}{2}$ so $x = \frac{e - b}{2}$. This is an integer since both $e$ and $b$
have the same parity as $\Delta$.

For the other product $\tau \cdot \frac{b + \sqrt{\Delta}}{2}$ we have an equation $\frac{e + \sqrt{\Delta}}{2} \cdot \frac{b + \sqrt{\Delta}}{2} = ax + \frac{b + \sqrt{\Delta}}{2}\,y$,
which we can rewrite as $\frac{eb + \Delta + (e + b)\sqrt{\Delta}}{4} = ax + \frac{b + \sqrt{\Delta}}{2}\,y$. From the coefficients of $\sqrt{\Delta}$ we
get $y = \frac{e + b}{2}$ which is an integer since $e$ and $b$ have the same parity. Then the equation
becomes $\frac{eb + \Delta}{4} = ax + \frac{eb + b^2}{4}$ which simplifies to $\Delta = 4ax + b^2$. Since $\Delta = b^2 - 4ac$
we have the integer solution $x = -c$. □
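
The two membership computations in the proof can be replayed numerically. The sketch below is an illustration under our own conventions: an element $p + q\sqrt{\Delta}$ is stored as a pair of fractions $(p, q)$, and `coords_in_LQ` (a hypothetical helper name) solves for the coefficients $x, y$ with respect to the basis $a, \frac{b + \sqrt{\Delta}}{2}$, returning `None` if they are not integers.

```python
from fractions import Fraction


def mul(u, v, disc):
    """Product of u = p1 + q1*sqrt(disc) and v = p2 + q2*sqrt(disc)."""
    (p1, q1), (p2, q2) = u, v
    return p1 * p2 + disc * q1 * q2, p1 * q2 + q1 * p2


def coords_in_LQ(elem, a, b):
    """Integers (x, y) with elem = a*x + ((b + sqrt(disc))/2)*y, or None."""
    p, q = elem
    y = 2 * q                          # compare coefficients of sqrt(disc)
    x = (p - Fraction(b, 2) * y) / a   # then solve for x
    if x.denominator == 1 and y.denominator == 1:
        return int(x), int(y)
    return None


def check_ideal(a, b, c):
    """Coordinates of tau*a and tau*(b + sqrt(disc))/2 in the lattice L_Q."""
    disc = b * b - 4 * a * c
    e = disc % 4                                   # 0 if disc = 4D, 1 if disc = 4d + 1
    tau = (Fraction(e, 2), Fraction(1, 2))
    gen1 = (Fraction(a), Fraction(0))
    gen2 = (Fraction(b, 2), Fraction(1, 2))
    return coords_in_LQ(mul(tau, gen1, disc), a, b), coords_in_LQ(mul(tau, gen2, disc), a, b)


for a, b, c in [(2, 2, 3), (3, 2, 5), (1, 1, 2), (1, 1, -1), (5, 4, 1)]:
    print((a, b, c), check_ideal(a, b, c))
```

In each case the coordinates returned should agree with the formulas $(x, y) = \bigl(\frac{e - b}{2}, a\bigr)$ and $(x, y) = \bigl(-c, \frac{e + b}{2}\bigr)$ found in the proof.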


We saw in the case of $\mathbb{Z}[i]$ that all ideals are square lattices, so they are obtained
from $\mathbb{Z}[i]$ by rotation about the origin and expansion. There are a few other negative
discriminants where the same thing happens and all ideals differ only by rotation
and rescaling, either expansion or contraction. One example is when $\Delta = -8$ so we
have $R_\Delta = \mathbb{Z}[\sqrt{-2}]$ which forms a rectangular lattice with rectangles of side lengths 1
and $\sqrt{2}$. For an arbitrary ideal $L$ in $\mathbb{Z}[\sqrt{-2}]$ let $\alpha$ be a nonzero point in $L$ closest to the
origin. Since $L$ is an ideal, the product $\sqrt{-2}\,\alpha$ must also be in $L$. Since multiplication
by $\sqrt{-2}$ rotates the plane by 90 degrees and expands it by a factor of $\sqrt{2}$, the set of all
linear combinations $\alpha x + \sqrt{-2}\,\alpha y$ for integers $x$ and $y$ forms a rectangular sublattice $L'$
of $L$ obtained from $\mathbb{Z}[\sqrt{-2}]$ by rotation and expansion. Since we chose $\alpha$ as the closest
point of $L$ to the origin, say of distance $\lambda$ to the origin, there can be no points of $L$ within
a distance less than $\lambda$ of any point of $L'$ other than that point itself. In other words, if one
takes the union of the interiors of all disks of radius $\lambda$ centered at points of $L'$, this union
intersects $L$ just in $L'$. However, this union is the whole plane since the ratio of the side
lengths of the rectangles of $L'$ is $\sqrt{2}$, so every point of a rectangle is within distance
$\frac{\sqrt{3}}{2}\lambda < \lambda$ of one of its corners. Thus $L$ equals the rectangular lattice $L'$.

This is essentially the same geometric argument we used to show that $\mathbb{Z}[\sqrt{-2}]$
has a Euclidean algorithm. There were five negative discriminants $\Delta$ for which $R_\Delta$
has a Euclidean algorithm, $\Delta = -3, -4, -7, -8, -11$. The argument in the preceding
paragraph shows that in each of these cases all ideals in $R_\Delta$ are equivalent under
rotation and rescaling. In the case $\Delta = -3$ the Eisenstein integers $\mathbb{Z}[\omega]$ form a grid
of equilateral triangles so all ideals are also grids of equilateral triangles that are
taken to themselves by multiplication by $\omega$, rotating the plane by 60 degrees. Two
examples are shown below.


$3x^2 + 3xy + y^2 \longleftrightarrow L(3, 1 + \omega)$          $7x^2 + 5xy + y^2 \longleftrightarrow L(7, 2 + \omega)$
For $\Delta = -7$ and $-11$ the lattice $R_\Delta = \mathbb{Z}[\omega]$ for $\Delta = -3$ is stretched vertically to form
a grid of isosceles triangles and all ideals are also grids of isosceles triangles, rotated
and rescaled from the triangles in $R_\Delta$.

We have been using the fact that multiplication by a fixed nonzero complex
number $\alpha$ always has the effect of rotating and rescaling the plane, keeping the origin
fixed. Since multiplication by $\alpha$ sends 1 to $\alpha$, the rescaling factor is the distance from
$\alpha$ to the origin and the angle of rotation is the angle between the positive $x$-axis and
the ray from the origin to $\alpha$. Since $\alpha$ can be any nonzero complex number, every
rotation and rescaling is realizable as multiplication by a suitably chosen $\alpha$.


Let us look at some examples of discriminants where not all forms are equivalent
to see whether there is more variety in the shapes of the lattices $L_Q$, so they are
not all obtained from $R_\Delta$ by rotation and rescaling. The examples will all be for
negative discriminants since this is the case that the norm of an element of $R_\Delta$ has
the geometric interpretation as the square of the distance to the origin, but when we
make general statements about lattices these will apply to both positive and negative
discriminants.

For a first example consider the lattices $L_Q$ in $\mathbb{Z}[\sqrt{-6}]$ for the two nonequivalent
forms $x^2 + 6y^2$ and $2x^2 + 3y^2$ of discriminant $-24$.


$x^2 + 6y^2 \longleftrightarrow L(1, \sqrt{-6})$          $2x^2 + 3y^2 \longleftrightarrow L(2, \sqrt{-6})$



The two lattices do not appear to differ just by rotation and rescaling, and we can verify
this by computing the ratio of the distances from the origin to the closest lattice point
and to the next-closest lattice point on a different line through the origin. For the
lattice $\mathbb{Z}[\sqrt{-6}]$ this ratio is $1/\sqrt{6}$ while for the other lattice it is $2/\sqrt{6}$. If the lattices
differed only by rotation and rescaling, the ratios would be the same.

Instead of measuring the distances from the origin to a nearby lattice point we
could measure the square of the distance, which is the norm of the lattice point. For
the forms shown above we would then get the ratios $1/6$ and $4/6 = 2/3$. It is no accident
that these are the ratios between the coefficients of $x^2$ and $y^2$ in the two forms since
these coefficients give the two smallest values of the forms, which occur on either side
of the source edge in their topographs. The norms of points in the lattice are related
to the values of the form by the formula $aQ(x, y) = N\bigl(ax + \frac{b + \sqrt{\Delta}}{2}\,y\bigr)$, so the smallest
norms correspond to the smallest values of the form, with the scaling factor $a$ in the
left side of the formula accounting for the fact that the fraction $4/6$ reduces to $2/3$ by
dividing numerator and denominator by $a = 2$.




As another example, consider the lattices $L_Q$ in $\mathbb{Z}[\sqrt{-5}]$ for the nonequivalent
forms $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$ of discriminant $-20$.


$x^2 + 5y^2 \longleftrightarrow L(1, \sqrt{-5})$          $2x^2 + 2xy + 3y^2 \longleftrightarrow L(2, 1 + \sqrt{-5})$


It is clear visually that the two lattices are not related just by rotation and rescaling
since the first lattice is rectangular while the second is not, and we can verify this by
computing the ratios of the norms of the two closest lattice points to the origin lying on
different lines through the origin. For the first lattice the ratio is $1/5$ corresponding to
the topograph having a source edge with adjacent labels 1 and 5, as in the preceding
example. For the second lattice the points closest to the origin are $\pm 2$ and $\pm 1 \pm \sqrt{-5}$
with norms 4 and 6, giving a ratio $4/6$ which reduces to $2/3$ via the rescaling factor
$a = 2$. The topograph of the second form has a source vertex surrounded by the
labels 2, 3, 3 for $x/y = 1/0$, $0/1$, and $-1/1$. The two 3's correspond to the two equal
sides of the isosceles triangles in the figure, of norm 6 which rescales to 3.


A slightly more complicated example is $\mathbb{Z}[\sqrt{-14}]$ with $\Delta = -56$ where there are
four proper equivalence classes of forms:


$x^2 + 14y^2 \longleftrightarrow L(1, \sqrt{-14})$          $2x^2 + 7y^2 \longleftrightarrow L(2, \sqrt{-14})$

$3x^2 + 2xy + 5y^2 \longleftrightarrow L(3, 1 + \sqrt{-14})$          $3x^2 - 2xy + 5y^2 \longleftrightarrow L(3, -1 + \sqrt{-14})$
For the first two forms the ratios of smallest norms are $1/14$ and $4/14 = 2/7$. For the
second two forms the norms of the three sides of the triangles are 9, 15, and 18 so
the ratio for the smaller two norms is $9/15 = 3/5$. The second two forms are equivalent
but not properly equivalent since their topographs have a source vertex surrounded
by the three distinct numbers 3, 5, and 6, the rescalings of the norms 9, 15, and 18.
The topographs of these two forms are mirror images obtained by changing the sign
of $x$ or $y$, thus changing the sign of the coefficient of the middle term $xy$ in the
form. The corresponding lattices are also mirror images obtained by reflecting across
either the $x$-axis or the $y$-axis, which also amounts to changing the sign of $x$ or $y$.
These two lattices are not equivalent under rotation and rescaling, so none of the four
lattices in this example are equivalent by rotation and rescaling.

Recall that the three values of an elliptic form surrounding a source vertex satisfy 
the triangle inequalities, so each value is less than or equal to the sum of the other two. 
This means that for the triangles in the lattices the square of each side length is less 
than or equal to the sum of the squares of the other two side lengths. Comparing these 
inequalities with the Pythagorean theorem, this is just saying that the triangles are 
acute triangles, unless the square of one side is actually equal to the sum of the squares 
of the other two sides in which case it is a right triangle. This only happens when there 
is a source edge instead of a source vertex. In this case the grid is rectangular, with 
each rectangle subdivided into two right triangles by either of its diagonals, but there 
is no reason to choose one diagonal rather than the other so it seems best to ignore 
the diagonals and just draw the rectangles. 

As we noted above, the two lattices $L(3, 1 + \sqrt{-14})$ and $L(3, -1 + \sqrt{-14})$ in
$\mathbb{Z}[\sqrt{-14}]$ are mirror images of each other under reflection across either the $x$-axis
or the $y$-axis. Reflecting a lattice across the $y$-axis gives the same result as reflecting
across the $x$-axis since lattices always have 180 degree rotational symmetry about
the origin. Reflecting a lattice across the $x$-axis amounts to taking the conjugates
of all elements of the lattice, so the reflection of a lattice $L = L(\alpha, \beta)$ is the lattice
$\overline{L} = L(\overline{\alpha}, \overline{\beta})$ called the conjugate lattice. If $L$ is an ideal it is easy to check that $\overline{L}$ is
also an ideal, so in this case $\overline{L}$ is the conjugate ideal of $L$. For lattices coming from
forms, the conjugate of $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ is $L\bigl(a, \frac{b - \sqrt{\Delta}}{2}\bigr)$ which is the same as $L\bigl(a, \frac{-b + \sqrt{\Delta}}{2}\bigr)$.


A lattice is equal to its conjugate exactly when it is symmetric with respect to
reflection across the coordinate axes. In the example of lattices in $\mathbb{Z}[\sqrt{-14}]$ the first
two lattices have this symmetry property while the second two do not.


Proposition 8.12. A lattice $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ is equal to its conjugate if and only if $b \equiv 0$
mod $a$. These are the rectangular lattices $L\bigl(a, \frac{\sqrt{\Delta}}{2}\bigr)$ with $b = 0$ and the isosceles
triangle lattices $L\bigl(a, \frac{a + \sqrt{\Delta}}{2}\bigr)$ with $b = a$.


Proof: Consider the points of a lattice $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ that are in the same horizontal row
as $\frac{b + \sqrt{\Delta}}{2}$. These points are equally spaced along this row at distance $|a|$ apart. The
lattice equals its conjugate exactly when reflection across the $y$-axis takes this set
of points to itself, so the only possibilities are that the set contains the point $\frac{\sqrt{\Delta}}{2}$ or
it contains $\frac{a + \sqrt{\Delta}}{2}$. Hence the lattice is either the rectangular lattice $L\bigl(a, \frac{\sqrt{\Delta}}{2}\bigr)$ or the
isosceles triangle lattice $L\bigl(a, \frac{a + \sqrt{\Delta}}{2}\bigr)$. In both cases these are lattices $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ with
$b \equiv 0$ mod $a$, and conversely any lattice $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ with $b \equiv 0$ mod $a$ is equal to one
of these two lattices since $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ is unchanged when multiples of $a$ are added to
$\frac{b + \sqrt{\Delta}}{2}$, thus adding multiples of $2a$ to $b$. □


The two types of self-conjugate lattices $L\bigl(a, \frac{\sqrt{\Delta}}{2}\bigr)$ and $L\bigl(a, \frac{a + \sqrt{\Delta}}{2}\bigr)$ correspond to
the forms $ax^2 + cy^2$ and $ax^2 + axy + cy^2$ whose topographs have mirror symmetry.
As we saw in Proposition 5.6, all forms with mirror symmetric topographs are
equivalent to forms of these two types.

In general, most ideals $L\bigl(a, \frac{b + \sqrt{\Delta}}{2}\bigr)$ are not self-conjugate. For example in the
Gaussian integers $\mathbb{Z}[i]$ all ideals are square lattices rotated and expanded from the
full lattice $\mathbb{Z}[i]$, but the only ones that are vertically and horizontally symmetric are
the ones where the angle of rotation is a multiple of 45 degrees, so these are the
lattices $\mathbb{Z}[i]$ and $L(2, 1 + i)$ or rescalings of these.


The examples we have seen so far lead one to ask how exact a correspondence 
there is between proper equivalence classes of forms of a given discriminant A and 
the shapes of lattices that are ideals in $R_\Delta$, where two lattices that differ only by
rotation and rescaling are regarded as having the same shape. The main theorem 
in this section will be that this is an exact one-to-one correspondence for negative 
discriminants, while for positive discriminants there is an analogous one-to-one cor- 
respondence using a more algebraic analogue of “shape” for lattices that works for 
both positive and negative discriminants. 


Before getting to the main theorem we will first explain a few general facts about
lattices in $R_\Delta$. Let us write $R_\Delta$ as $\mathbb{Z}[\tau]$ for $\tau = \sqrt{D}$ when $\Delta = 4D$ and $\tau = \frac{1 + \sqrt{\Delta}}{2}$ when
$\Delta = 4d + 1$. Let $L$ be a lattice in $\mathbb{Z}[\tau]$. Since $L$ is not entirely contained in the $x$-axis
there exist elements $m + n\tau$ in $L$ with $n > 0$. Choose such an element $\alpha = m + n\tau$
with minimum positive $n$, so $\alpha$ lies in the $n$th row of $\mathbb{Z}[\tau]$ and there are no elements
of $L$ in any row between the $0$th and the $n$th rows. Since $L$ is a lattice all elements of
$L$ must then lie in rows numbered an integer multiple of $n$. In particular the element
$k\alpha$ lies in the $kn$th row for each integer $k$. These elements $k\alpha$ lie on a line through
the origin, and $L$ must also contain elements not on this line, so some $kn$th row must
contain another element $\beta$ of $L$ besides $k\alpha$. The difference $\beta - k\alpha$ then lies in the
$x$-axis and is a nonzero integer in $L$. Choosing a minimal positive integer $p$ in $L$, the
lattice property of $L$ implies that the integers in $L$ are precisely the integer multiples
of $p$. It follows that $L$ contains the lattice $L(p, \alpha) = L(p, m + n\tau)$, and in fact $L$ is
equal to $L(p, m + n\tau)$ since otherwise either $p$ or $n$ would not be minimal. We are free to
change $m$ by adding or subtracting any integer multiple of $p$ without affecting the
lattice, so we may assume $0 \le m < p$.

Thus we see that every lattice $L$ in $\mathbb{Z}[\tau]$ has a basis of the special type $p, m + n\tau$
for $p$ and $n$ positive integers and $m$ an integer in the range $0 \le m < p$. Such a
basis is called a reduced basis. A reduced basis for a lattice $L$ is unique since $p$ is
the smallest positive integer in $L$ and the first row of $L$ above the $x$-axis is in the
$n$th row of $\mathbb{Z}[\tau]$, with the elements of $L$ in this row equally spaced $p$ units apart so
there is a unique such element $m + n\tau$ with $0 \le m < p$. Thus one can tell whether
two lattices in $\mathbb{Z}[\tau]$ are equal by finding a reduced basis for each lattice and seeing
whether these reduced bases are equal.


Let us describe how to compute a reduced basis for a lattice $L(\alpha_1, \alpha_2)$ where
$\alpha_1, \alpha_2$ is an arbitrary given basis. There are three simple ways to change from one
basis to another basis for the same lattice:


(1) Replace one $\alpha_i$ with $\alpha_i + k\alpha_j$, adding an integer $k$ times the other basis
element $\alpha_j$ to $\alpha_i$. Geometrically this changes the parallelogram with vertices
$0, \alpha_1, \alpha_2, \alpha_1 + \alpha_2$ to a parallelogram with one side the same, the side from 0
to $\alpha_j$, but the opposite side with ends $\alpha_i$ and $\alpha_i + \alpha_j$ is translated along the line
containing it.

(2) Replace one $\alpha_i$ by $-\alpha_i$.


(3) Interchange $\alpha_1$ and $\alpha_2$.


These operations on bases can be interpreted as operations on matrices if we let
$\alpha_1 = a_1 + b_1\tau$ and $\alpha_2 = a_2 + b_2\tau$ and then consider the matrix $A = \begin{pmatrix} a_1 & a_2 \\ b_1 & b_2 \end{pmatrix}$. The
operation (1) changes one column of $A$ by adding $k$ times the other column to it.
Operation (2) multiplies one column by $-1$, and operation (3) interchanges the two
columns. The goal is to use these three operations to change the given matrix $A$ to a
matrix of the special form $\begin{pmatrix} a & b \\ 0 & c \end{pmatrix}$ with $a$ and $c$ positive and with $0 \le b < a$, so this
will be the matrix of a reduced basis.

First we focus on the second row of A. This must have a nonzero entry since α₁
and α₂ are not both contained in the x-axis. The nonzero entries in the second row
can be made positive by type (2) operations. If both entries bᵢ are positive, choose
a column with smallest positive entry bᵢ. By subtracting a suitable multiple of this
column from the other column we can make the other column have its entry bⱼ satisfy
0 ≤ bⱼ < bᵢ. This process can be repeated using columns with successively smaller
second entries until only one nonzero bᵢ remains. Switching this column with the
first column if necessary, we can then assume that b₁ = 0 and b₂ > 0. Then a₁
must be nonzero, and if it is negative we can make it positive by multiplying the first
column by −1. Finally, we can make a₂ satisfy 0 ≤ a₂ < a₁ by adding or subtracting
a multiple of the first column to the second column to finish the process.
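
The reduction procedure can be sketched in a few lines of Python. This is only an
illustrative sketch under our own conventions: the name reduced_basis and the
representation of an element x + yτ as a pair (x, y) are assumptions made here, and
the Euclidean steps are organized slightly differently from the description above
(signs are fixed at the end rather than at the start), although only operations of
types (1)-(3) are used.

```python
def reduced_basis(alpha1, alpha2):
    """Return (a, b, c) such that a, b + c*tau is a reduced basis.

    Each input is a pair (x, y) standing for the element x + y*tau.
    Only the column operations (1)-(3) are used.
    """
    (a1, b1), (a2, b2) = alpha1, alpha2
    if a1 * b2 - a2 * b1 == 0:
        raise ValueError("the two elements lie on a line through 0")
    # Euclidean algorithm on the second row (b1, b2).
    while b1 != 0:
        k = b2 // b1
        a2, b2 = a2 - k * a1, b2 - k * b1   # type (1): column 2 minus k times column 1
        a1, b1, a2, b2 = a2, b2, a1, b1     # type (3): interchange the columns
    # Now b1 == 0, so the first column is an integer on the x-axis.
    if b2 < 0:
        a2, b2 = -a2, -b2                   # type (2) on the second column
    if a1 < 0:
        a1 = -a1                            # type (2) on the first column (b1 is 0)
    a2 %= a1                                # type (1): move a2 into 0 <= a2 < a1
    return a1, a2, b2
```

For instance reduced_basis((7, 1), (12, 1)) returns (5, 2, 1), so the lattice spanned by
7 + τ and 12 + τ is L(5, 2 + τ).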


An important quantity associated to a lattice L in Z[τ] is the number of parallel
translates of L, including L itself, that are needed to completely cover all points of
the larger lattice Z[τ]. For example if a, b + cτ is a reduced basis for L one can
first translate L horizontally by the numbers 0, 1, …, a − 1 to cover all of the x-axis
and all rows of Z[τ] containing points of L. Then c translates of these rows in the
direction of τ will cover Z[τ] for a total of ac translates of L to cover Z[τ].

For a lattice L in Z[τ] the number of translates of L needed to cover all of Z[τ]
is called the norm of L and written N(L). Any two translates of L are either disjoint
or coincide exactly, so there is a unique set of translates of L covering Z[τ]. Thus
there is no ambiguity in the value of N(L). As the reader can see by looking at the
various lattices we have pictured earlier in this section, the norm of a lattice measures
how “large” or “spread out” the lattice is compared with Z[τ].

Another way to interpret the norm is in terms of areas. For a basis α, β for a
lattice L consider the parallelogram P_{α,β} with vertices 0, α, β, and α + β.


Proposition 8.13. For a lattice L in Z[τ] with basis α, β the area of the
parallelogram P_{α,β} is independent of the choice of the basis α, β. The ratio of this area
to the corresponding area for any basis parallelogram for the full lattice Z[τ] is
equal to the norm N(L).


Proof: The operations (1)-(3) on bases do not change the area of basis parallelograms,
so every basis parallelogram for L has area equal to the area of P_{a,b+cτ} for the reduced
basis a, b + cτ for L. To prove the statement about the ratio of areas, note that the
area of P_{a,b+cτ} does not depend on b so we can assume that b = 0. The parallelogram
P_{a,cτ} decomposes as ac nonoverlapping copies of the parallelogram P_{1,τ} for Z[τ],
so the ratio of the areas is ac, which is the norm of the lattice L = L(a, b + cτ). □


There is also a more algebraic description of the norm of a lattice L(α, β) in
terms of determinants. If we write α = a + bτ and β = c + dτ then we have the
associated matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$. An operation of type (1)
adding a multiple of one column to
the other does not change the determinant of the matrix, while operations (2) and (3)
only change the sign of the determinant. Since the absolute value of the determinant
is unchanged by all three types of operations, it can be computed from a reduced
basis a, b + cτ where it is ac, the norm of the lattice. Thus for a lattice L with basis
a + bτ, c + dτ we have N(L) = |ad − bc|.
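
As a sanity check on the formula N(L) = |ad − bc|, one can count translates by brute
force and compare with the determinant. The following Python sketch is only an
illustration: the function names are hypothetical, elements x + yτ are stored as pairs
(x, y), and the two basis elements are assumed not to lie on a line through 0. It uses
the fact that every translate of L meets the box 0 ≤ x, y < n for n = |ad − bc|,
since n·Z[τ] is contained in L.

```python
def in_lattice(point, alpha, beta):
    """Test whether point = s*alpha + t*beta has an integer solution (s, t).

    All arguments are pairs (x, y) standing for x + y*tau.
    """
    (x, y), (a, b), (c, d) = point, alpha, beta
    det = a * d - b * c
    # Cramer's rule gives s = (x*d - y*c)/det and t = (a*y - b*x)/det.
    return (x * d - y * c) % det == 0 and (a * y - b * x) % det == 0


def count_translates(alpha, beta):
    """Count the translates of L(alpha, beta) needed to cover Z[tau]."""
    n = abs(alpha[0] * beta[1] - alpha[1] * beta[0])
    reps = []  # one representative point for each translate found so far
    for x in range(n):
        for y in range(n):
            # (x, y) lies in a new translate if it differs from no earlier
            # representative by an element of the lattice.
            if not any(in_lattice((x - u, y - v), alpha, beta) for u, v in reps):
                reps.append((x, y))
    return len(reps)
```

For the lattice L(5, 2 + τ) this count is 5, in agreement with the determinant.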

The sign of the determinant ad − bc has a geometric interpretation as well. We
will say the basis α, β is positively ordered if the angle from the ray from 0 through
α to the ray from 0 through β is between 0 and π, and if the angle
is between 0 and −π then we say the basis is negatively ordered.
Reversing the order of two basis elements thus changes the positive
ordering to the negative ordering and vice versa. The statement is then that α, β is
positively or negatively ordered exactly according to whether ad − bc is positive or
negative. To verify this we again use the operations (1)-(3). Operation (1) does not
change whether a basis is positively ordered or negatively ordered, while operations
(2) and (3) take a positively ordered basis to a negatively ordered basis and vice versa.
The sign of the determinant behaves in exactly the same way, so if we go backwards
through the sequence of operations converting α, β into a reduced basis, which is
obviously positively ordered with positive determinant, we see that at each step the
assertion continues to be true.

Given a lattice L(α, β) and a nonzero element γ of Z[τ] we can multiply all
elements of L by γ to form a new lattice γL = L(γα, γβ). To check that this is
indeed a lattice we should check that γα and γβ do not lie on the same line through
the origin, but if they did then we would have γα = tγβ for some real number t, and
then after canceling γ from this equation we would have α = tβ, which would mean
that α and β were on the same line through the origin, so L(α, β) would not be a
lattice.

When Δ < 0 the lattice γL is a rotation and rescaling of L, but for Δ > 0 the
geometric relation between the two lattices is not as simple. There is however a
simple formula relating the norms of L and γL, valid for both positive and negative
discriminants:


Proposition 8.14. N(γL) = |N(γ)|N(L).


The absolute value is needed when Δ > 0 since norms of lattices are always
positive but N(γ) can be negative when Δ > 0. When Δ < 0 the formula is just
N(γL) = N(γ)N(L) and can be seen geometrically since multiplication by γ rescales
by the distance from γ to the origin, which is √N(γ), so the areas of parallelograms
are multiplied by N(γ), the square of the rescaling factor.


Proof: This is a calculation with determinants that will be easier if we regard Z[τ] as
a subset of Q(√Δ). Let γ = p + q√Δ and let α = a + b√Δ and β = c + d√Δ where
p, q, a, b, c, d are rational numbers. Multiplication by γ is a linear transformation of
Q(√Δ):

$$(p + q\sqrt{\Delta})(x + y\sqrt{\Delta}) = (px + q\Delta y) + (qx + py)\sqrt{\Delta}$$

The matrix of this transformation is $\begin{pmatrix} p & q\Delta \\ q & p \end{pmatrix}$.
Thus γα and γβ correspond to the columns of the product
$\begin{pmatrix} p & q\Delta \\ q & p \end{pmatrix}\begin{pmatrix} a & c \\ b & d \end{pmatrix}$.
The absolute value of the determinant of this
product is therefore N(γL). This equals the product of the absolute values of the
determinants of the two individual matrices, which is |N(γ)|N(L) since the
determinant of the first matrix in the product is p² − Δq² = N(γ) and the absolute value of
the determinant of the second matrix in the product is N(L). □
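
The proposition is easy to test numerically. The sketch below is a hedged illustration
rather than anything from the text: it assumes the case Δ = 4D, so that τ = √D, stores
x + y√D as the pair (x, y), and uses helper names of our own choosing.

```python
def mul(u, v, D):
    """Product of u = x1 + y1*sqrt(D) and v = x2 + y2*sqrt(D), as pairs."""
    (x1, y1), (x2, y2) = u, v
    return (x1 * x2 + D * y1 * y2, x1 * y2 + x2 * y1)


def element_norm(u, D):
    """Norm x^2 - D*y^2 of the element x + y*sqrt(D)."""
    x, y = u
    return x * x - D * y * y


def lattice_norm(alpha, beta):
    """Norm |ad - bc| of the lattice with basis a + b*sqrt(D), c + d*sqrt(D)."""
    (a, b), (c, d) = alpha, beta
    return abs(a * d - b * c)


def check_norm_formula(gamma, alpha, beta, D):
    """Check N(gamma L) == |N(gamma)| * N(L) for L = L(alpha, beta)."""
    scaled = lattice_norm(mul(gamma, alpha, D), mul(gamma, beta, D))
    return scaled == abs(element_norm(gamma, D)) * lattice_norm(alpha, beta)
```

With D = 3, γ = 1 + √3 and L = L(3, √3) we have N(γ) = −2 and N(L) = 3, while γL has
basis 3 + 3√3, 3 + √3 and norm 6, which shows why the absolute value is needed.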


When L is an ideal in R_Δ = Z[τ] then so is γL. This is because if α is in L and
β is in R_Δ then β(γα) is in γL since it equals γ(βα) and this is in γL since βα is
in L if L is an ideal.

An important special case is when L = R_Δ so γR_Δ is the ideal consisting of all
multiples of γ by elements of R_Δ. This is called the principal ideal generated by γ.
The usual notation for this ideal is simply (γ), although this notation can sometimes
be a little confusing since parentheses are also used in formulas for multiplication of
elements. For example, in the previous paragraph we had an equality β(γα) = γ(βα)
in which these were just elements of R_Δ, not ideals. However, this equation remains
valid when (γα) and (βα) are regarded as ideals since it is always true for principal
ideals that δ(ε) = (δε), so the equation of ideals β(γα) = γ(βα) can be written as
(βγα) = (γβα), which holds since βγ = γβ.

Since N(R_Δ) = 1 the preceding proposition gives a simple relationship between
the norm of an element and the norm of the ideal it generates:


Corollary 8.15. N((α)) = |N(α)| for each nonzero element α in R_Δ. □


For negative discriminants, principal ideals (α) = αR_Δ have the same shape as
the full lattice R_Δ. Conversely, if an ideal L in R_Δ has the same shape as R_Δ this
means that L = αR_Δ for some complex number α, and α has to lie in R_Δ and in
fact in L since α is the element α·1 in αR_Δ = L. As the examples earlier in this
section show, for some negative discriminants such as −3, −4, −7, −8, and −11 all
ideals have the same shape and hence all ideals are principal ideals, while for other
negative discriminants there can exist nonprincipal ideals since not all ideals have the
same shape as the principal ideals.


We have been focusing on the ideals L_Q = L(a, (b + √Δ)/2) associated to quadratic
forms Q(x, y) = ax² + bxy + cy² of discriminant Δ, and it is natural to ask whether
every ideal in R_Δ is equal to L_Q for some form Q of discriminant Δ. One way to see
that this is not true is to observe that the lattices L_Q = L(a, (b + √Δ)/2) have the special
property that they contain an element (b + √Δ)/2 lying in the first row of the lattice R_Δ
above the x-axis, but this is not the case for all ideals since we can expand an ideal
L_Q by a positive integer factor n to get a new ideal nL_Q which has no elements in
the first row of R_Δ above the x-axis if n > 1. However, nothing more complicated
than this can happen:


Proposition 8.16. Every ideal in R_Δ is equal to nL_Q for some positive integer n
and some form Q(x, y) = ax² + bxy + cy² of discriminant Δ with a > 0.


Since an ideal L_Q has an element in the first row of R_Δ above the x-axis it cannot
be a multiple nL of any other ideal L with n > 1. We call an ideal with this property
a primitive ideal, in analogy with the definition of a primitive form. The proposition
says that all ideals are positive integer multiples of primitive ideals, and the primitive
ideals are just the ideals L_Q coming from forms.


Proof: We write R_Δ as Z[τ] as before. Let L be an ideal in Z[τ]. Since L is a lattice
it has a reduced basis p, m + nτ. Then pτ lies in L since p does. Since pτ is
in the pth row of Z[τ] we must have p = an for some positive integer a. For
α = m + nτ the product ατ must also lie in L. In the case Δ = 4D we have τ = √D
so ατ = mτ + nτ² = mτ + nD. This is in the mth row of Z[τ] so n must divide m,
say m = nq. In the case Δ = 4d + 1 we have τ² = τ + d so ατ = (m + n)τ + nd.
This is in the (m + n)th row of Z[τ] so n divides m + n and hence also m so we
can again write m = nq. Thus L = L(p, m + nτ) = L(na, nq + nτ) = nL(a, q + τ).
Here L(a, q + τ) is an ideal since nL(a, q + τ) is an ideal.

To finish the proof we would like to find integers b and c such that q + τ = (b + √Δ)/2
and Δ = b² − 4ac since L(a, q + τ) will then be L_Q for Q = ax² + bxy + cy² with
discriminant Δ. Consider first the case Δ = 4D so q + τ = q + √D. This is an element
of the ideal L(a, q + √D) so if we multiply it by its conjugate q + τ̄ = q − √D we get
an integer lying in L(a, q + √D). This integer must be a multiple of a, the smallest
positive integer in L(a, q + √D), so we have (q + τ)(q + τ̄) = (q + √D)(q − √D) =
q² − D = ac for some integer c. Hence (2q)² − 4D = 4ac, and since 4D = Δ this can
be rewritten as Δ = b² − 4ac for b = 2q. We also have
q + τ = q + √D = (2q + √(4D))/2 = (b + √Δ)/2 so the case Δ = 4D is finished.


In the other case Δ = 4d + 1 we again look at the product (q + τ)(q + τ̄). By the
same reasoning as in the first case this must be a multiple of a, so (q + τ)(q + τ̄) = ac
for some integer c. Writing this out, we have (q + (1 + √Δ)/2)(q + (1 − √Δ)/2) = ac.
Multiplying this equation by 4 gives (2q + 1 + √Δ)(2q + 1 − √Δ) = 4ac which simplifies to
(2q + 1)² − Δ = 4ac. Thus if we take b = 2q + 1 we have Δ = b² − 4ac and
q + τ = q + (1 + √Δ)/2 = (b + √Δ)/2. This finishes the case Δ = 4d + 1. □
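
The proof is constructive, and the construction can be sketched in Python. This is an
illustration under stated assumptions, not a definitive implementation: the function
name ideal_to_form is hypothetical, the input is a reduced basis p, m + nτ together
with Δ, and τ is √D when Δ = 4D and (1 + √Δ)/2 when Δ = 4d + 1, as in the proof.
The two divisibility tests are exactly the conditions derived above, so a lattice failing
either one cannot be an ideal.

```python
def ideal_to_form(delta, p, m, n):
    """Split the ideal with reduced basis p, m + n*tau as n * L_Q.

    Returns (n, a, b, c) with Q = a x^2 + b xy + c y^2 of discriminant delta,
    or raises ValueError if the lattice cannot be an ideal.
    """
    if p % n != 0 or m % n != 0:
        raise ValueError("n must divide p and m for an ideal")
    a, q = p // n, m // n
    # q + tau = (b + sqrt(delta))/2 with b = 2q or b = 2q + 1.
    b = 2 * q if delta % 4 == 0 else 2 * q + 1
    # (q + tau) times its conjugate equals (b^2 - delta)/4 and must be a multiple of a.
    if (b * b - delta) % (4 * a) != 0:
        raise ValueError("(q + tau) times its conjugate is not a multiple of a")
    c = (b * b - delta) // (4 * a)
    return n, a, b, c
```

For Δ = −24 the ideal with reduced basis 15, 6 + 3τ gives (3, 5, 4, 2), that is,
3L_Q for Q = 5x² + 4xy + 2y², whereas for Δ = −4 the lattice L(2, i) is rejected.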


The preceding proposition allows us to relate norms of ideals to the
representation problem for forms. As we know, the numbers represented by the principal form
of discriminant Δ are just the norms of primitive elements of R_Δ. If we now consider
all forms, not just the principal form, then there is an analogous statement for norms
of ideals in R_Δ:


Proposition 8.17. The positive numbers represented by forms of discriminant Δ are
exactly the norms of primitive ideals in R_Δ. More specifically, the positive numbers
represented by a form Q are exactly the norms of ideals L_{Q′} associated to forms
Q′ equivalent to Q.


Since the norms of arbitrary ideals are just squares times the norms of primitive 
ideals, it follows that the norms of all ideals are just the positive values of all forms 
of the given discriminant. 


Proof: If a positive number a is represented by a form of discriminant Δ then this
form is equivalent to a form ax² + bxy + cy². The associated ideal L(a, (b + √Δ)/2)
has norm a and is primitive. Thus all positive represented numbers are norms of
primitive ideals. Conversely, by Proposition 8.16 every primitive ideal can be written
as the ideal L(a, (b + √Δ)/2) associated to a form ax² + bxy + cy² of discriminant Δ with
a > 0. This form represents a and the ideal L(a, (b + √Δ)/2) has norm a, so all norms of
primitive ideals are represented by forms. □

Let us look at an example, the case Δ = −24 with R_Δ = Z[√−6]. Here the class
number is 2 corresponding to the forms x² + 6y² and 2x² + 3y².


[Figure: topographs of the forms x² + 6y² and 2x² + 3y²]


To each form ax² + bxy + cy² of discriminant −24 we have the associated primitive
ideal L(a, (b + √−24)/2) = L(a, b/2 + √−6) of norm a. This corresponds to a region labeled
a in one of the two topographs, with b the label on one of the edges bordering this
region. The sign of b depends on the orientation of this edge, and in the topographs
shown above we have oriented the edges to make all edge labels positive. We could
instead orient the edges surrounding the a region so that their labels form an arithmetic
progression with increment 2a when traversed in the clockwise direction around the
border of the a region. Then there is a unique edge such that 0 ≤ b < 2a, or
equivalently 0 ≤ b/2 < a, which is exactly the condition for the basis a, b/2 + √−6 to be a
reduced basis for the ideal L(a, b/2 + √−6). Thus there is an exact one-to-one
correspondence between primitive ideals and regions in the two topographs since any two
regions with the same a and b labels must be related by an orientation-preserving
symmetry of the topograph, but these topographs have only mirror symmetry.

For example, ideals of norm 5 correspond to regions labeled 5 in the two
topographs, and there are just two of these, both in the second topograph, with the
upper region corresponding to L(5, 2 + √−6) (from the edge labeled 4) and the lower
region corresponding to L(5, 3 + √−6) (from the edge labeled 6). Thus these are the
only two ideals of norm 5. These two ideals are conjugate since the conjugate of
L(5, 2 + √−6) is L(5, 2 − √−6) = L(5, −2 + √−6) = L(5, 3 + √−6). This happens
generally for all regions in the topographs, as conjugate ideals are obtained by reflecting
across the horizontal line of symmetry of the topographs. The two regions in each
topograph that intersect the symmetry line correspond to ideals that equal their
conjugate, namely L(1, √−6) = Z[√−6] and L(6, √−6) = (√−6) for the first topograph,
and L(2, √−6) and L(3, √−6) for the second topograph.

Nonprimes can appear more than twice in the topographs, as happens for 35
which appears four times. From these regions we can read off the four ideals of
norm 35. In the upper half of the second topograph the two regions labeled 35 give
the ideals L(35, 8 + √−6) and L(35, 13 + √−6) and in the lower half of the topograph
we have their conjugates L(35, 27 + √−6) and L(35, 22 + √−6).
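
The primitive ideals of a given norm a can also be listed without drawing topographs,
by searching for the edge labels b with 0 ≤ b < 2a for which c = (b² − Δ)/4a is an
integer. The following Python sketch does this search directly; the function name is
our own, and the approach is a brute-force illustration rather than the method used in
the text.

```python
def primitive_ideals_of_norm(delta, a):
    """List the b with 0 <= b < 2a and b^2 = delta mod 4a.

    Each such b gives the primitive ideal L(a, (b + sqrt(delta))/2) of norm a,
    associated to the form a x^2 + b xy + c y^2 with c = (b^2 - delta)/(4a).
    """
    return [b for b in range(2 * a) if (b * b - delta) % (4 * a) == 0]
```

For Δ = −24 this gives b = 4, 6 when a = 5 and b = 16, 26, 44, 54 when a = 35.
Halving b recovers the ideals L(5, 2 + √−6), L(5, 3 + √−6) and L(35, 8 + √−6),
L(35, 13 + √−6), L(35, 22 + √−6), L(35, 27 + √−6) found above.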

The ideals corresponding to regions in the first topograph are principal ideals
since the form here is the norm form N(x + y√−6) = x² + 6y². For example the
label 25 in the upper right is the norm of the ideal L(25, 13 + √−6), from the edge
labeled 26, and similarly the label 25 in the lower right is the norm of the ideal
L(25, 12 + √−6). These two regions correspond to the fractions x/y = ±1/2 so 25 is
the norm of 1 + 2√−6 and 1 − 2√−6, hence also of the principal ideals (1 + 2√−6) and
(1 − 2√−6). The principal ideal (5) has norm 25 as well but is not a primitive ideal.
The ideal L(25, 13 + √−6), being primitive, must therefore be either (1 + 2√−6) or
(1 − 2√−6). To decide which, we need to determine which of the two principal ideals
contains 25 and 13 + √−6. They both contain 25 since 25 = (1 + 2√−6)(1 − 2√−6)
so we need to determine whether 13 + √−6 is a multiple of 1 + 2√−6 or of 1 − 2√−6
by an element of Z[√−6]. This is done by computing the relevant quotients:

$$\frac{13+\sqrt{-6}}{1+2\sqrt{-6}} = \frac{13+\sqrt{-6}}{1+2\sqrt{-6}}\cdot\frac{1-2\sqrt{-6}}{1-2\sqrt{-6}} = \frac{25-25\sqrt{-6}}{25} = 1-\sqrt{-6}$$

$$\frac{13+\sqrt{-6}}{1-2\sqrt{-6}} = \frac{13+\sqrt{-6}}{1-2\sqrt{-6}}\cdot\frac{1+2\sqrt{-6}}{1+2\sqrt{-6}} = \frac{1+27\sqrt{-6}}{25}$$

This last quotient is not in Z[√−6] so we conclude that L(25, 13 + √−6) is the principal
ideal (1 + 2√−6). Taking conjugates gives L(25, 12 + √−6) = (1 − 2√−6).
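
The divisibility test used here is mechanical, and a short Python sketch may make it
concrete. The function name quotient and the pair representation (x, y) for x + y√D
are assumptions of this sketch; it simply multiplies by the conjugate of the divisor and
checks whether both coordinates are divisible by its norm.

```python
def quotient(alpha, gamma, D=-6):
    """Return alpha/gamma as a pair if it lies in Z[sqrt(D)], otherwise None.

    alpha = x + y*sqrt(D) and gamma = p + q*sqrt(D) are given as pairs,
    and gamma is assumed to be nonzero.
    """
    (x, y), (p, q) = alpha, gamma
    n = p * p - D * q * q        # norm of gamma
    re = x * p - D * y * q       # rational part of alpha * conjugate(gamma)
    im = y * p - x * q           # sqrt(D) part of alpha * conjugate(gamma)
    if re % n == 0 and im % n == 0:
        return (re // n, im // n)
    return None
```

Here quotient((13, 1), (1, 2)) returns (1, -1), the element 1 − √−6, while
quotient((13, 1), (1, -2)) returns None.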


For most negative discriminants the same one-to-one correspondence holds
between primitive ideals and regions in the topographs for that discriminant, where for
topographs without mirror symmetry we should take both the topograph itself and its
mirror image topograph. The only exceptional negative discriminants are Δ = −3 and
Δ = −4, the two cases when the topographs have orientation-preserving symmetries.
In these cases the regions that correspond to each other under orientation-preserving
symmetries correspond to a single primitive ideal. For positive discriminants the
situation is very similar, the only differences being that one only considers regions in
the topographs with positive labels, and then the primitive ideals correspond to
regions within one period of the periodic topograph since the orientation-preserving
symmetries are just the translations along the periodic separator line.


As we saw in Chapter 6, a key part of the problem of determining which numbers
are represented by forms of a given discriminant is determining which primes are
represented. The corresponding problem for ideals is to determine which primes p
are norms of ideals in R_Δ. These ideals must be primitive, the ideals L(p, (b + √Δ)/2) for
Δ a square mod 4p, namely Δ ≡ b² mod 4p, coming from the equation Δ = b² − 4ac
with a = p.

In Proposition 6.15 we saw that if a prime p is represented by a form of
discriminant Δ then this form is unique up to equivalence. Furthermore, by Proposition 6.16
all the appearances of p in a topograph are images of each other under symmetries
of the topograph. This means that there are at most two ideals in R_Δ of norm p, the
ideal L(p, (b + √Δ)/2) and its conjugate L(p, (b − √Δ)/2) = L(p, (−b + √Δ)/2). When the
ideal and its conjugate are equal there is only one ideal of norm p.


Proposition 8.18. (a) The ideals in R_Δ of prime norm p with p odd are:
• For Δ = 4d, the ideal L(p, B + √d) and its conjugate L(p, −B + √d), where
d ≡ B² mod p.
• For Δ odd, the ideal L(p, B + (1 + √Δ)/2) and its conjugate L(p, −B − 1 + (1 + √Δ)/2),
where Δ ≡ (2B + 1)² mod p.
(b) The ideals in R_Δ of norm 2 are:
• For Δ = 4d with d even, the ideal L(2, √d).
• For Δ = 4d with d odd, the ideal L(2, 1 + √d).
• For Δ = 8k + 1, the ideal L(2, (1 + √Δ)/2) and its conjugate L(2, 1 + (1 + √Δ)/2).
(c) An ideal of prime norm p equals its conjugate if and only if p divides Δ.


Proof: The condition for p to be the norm of an ideal in R_Δ is that Δ ≡ b² mod 4p
for some integer b, and the ideal is then L(p, (b + √Δ)/2). If Δ = 4d then b must be even
so b = 2B for some integer B. The congruence Δ ≡ b² mod 4p is then equivalent
to d ≡ B² mod p. The ideal in this case is L(p, B + √d). If Δ is odd then so is b
and we can write b = 2B + 1. The congruence Δ ≡ b² mod 4p is then Δ ≡ (2B + 1)²
mod 4p. This implies Δ ≡ (2B + 1)² mod p and the converse is also true since
Δ ≡ (2B + 1)² mod 4 when Δ is odd, both sides of this congruence being 1 mod 4.
The ideal L(p, (b + √Δ)/2) is then L(p, B + (1 + √Δ)/2). This finishes part (a).

When p = 2 the congruence Δ ≡ b² mod 4p becomes Δ ≡ b² mod 8 which is
solvable just when Δ ≡ 0, 1, 4 mod 8, with solutions b = 0, 1, 2. This gives the ideals
in part (b). (The first two ideals equal their conjugates so there is no need to include
their conjugates.)


For part (c) the condition for L(p, (b + √Δ)/2) to equal its conjugate is that p divides
b, by Proposition 8.12. When p is prime this is equivalent to p dividing Δ since
Δ = b² − 4pc. □
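
Part (a) of the proposition reduces the search for ideals of odd prime norm p to a
congruence mod p. A minimal Python sketch of that search follows; the function name
is hypothetical and p is assumed to be an odd prime, which the code does not check.

```python
def prime_norm_ideals(delta, p):
    """Return the values B in range(p) from Proposition 8.18(a).

    For delta = 4d each B gives the ideal L(p, B + sqrt(d)); for odd delta
    each B gives L(p, B + (1 + sqrt(delta))/2). Assumes p is an odd prime.
    """
    if delta % 4 == 0:
        d = delta // 4
        return [B for B in range(p) if (B * B - d) % p == 0]
    return [B for B in range(p) if ((2 * B + 1) ** 2 - delta) % p == 0]
```

For Δ = −24 and p = 5 the result is [2, 3], matching the ideals L(5, 2 + √−6) and
L(5, 3 + √−6), while for p = 3, which divides Δ, the result is the single value [0], in
line with part (c).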


We have seen how to go from a quadratic form Q to an ideal L_Q, and it will be
useful to go in the opposite direction as well, from an ideal L in R_Δ to a quadratic
form Q_L of discriminant Δ. As motivation we can start with the earlier formula
aQ(x, y) = N(ax + ((b + √Δ)/2)y) which says that, up to the constant factor a, the form
Q(x, y) = ax² + bxy + cy² can be obtained by restricting the usual norm in R_Δ to
the elements ax + ((b + √Δ)/2)y in the ideal L_Q. We can try the same thing for any lattice
L = L(α, β) in R_Δ, defining a quadratic form by:

$$Q(x, y) = N(\alpha x + \beta y) = (\alpha x + \beta y)(\bar\alpha x + \bar\beta y) = \alpha\bar\alpha x^2 + (\alpha\bar\beta + \bar\alpha\beta)xy + \beta\bar\beta y^2$$

Here the coefficients of x², xy, and y² are integers since they are equal to their
conjugates. The form Q depends on the choice of the basis α, β for L. Another basis
α′, β′ can be expressed as linear combinations α′ = pα + qβ, β′ = rα + sβ with
integer coefficients. Since the change of basis can be reversed, going from α′, β′ back
to α, β, the 2 × 2 matrix $\begin{pmatrix} p & q \\ r & s \end{pmatrix}$ has determinant ±1,
and conversely any such matrix
gives a valid change of basis for L. Changing the basis also produces a change of
variables in the form Q(x, y) since N(α′x + β′y) = N((pα + qβ)x + (rα + sβ)y) =
N(α(px + ry) + β(qx + sy)) = Q(px + ry, qx + sy). Here the matrix is the transpose
$\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ with the same determinant ±1. Thus changing
the basis for L produces an
equivalent form, and every equivalent form can be realized by some change of basis
for L.

The form N(αx + βy) depends on the ordering for the two basis elements α
and β since reversing their order interchanges x and y, which gives a mirror image
topograph. We can eliminate this ambiguity by always using the positive ordering for
α and β. If we only use positively ordered bases, then the change of basis
matrices have determinant +1 since a change of basis transformation takes a positively
ordered basis to a positively ordered basis if and only if its matrix has positive
determinant. This is because changing a basis amounts to replacing its matrix
$\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ by a product
$\begin{pmatrix} a & c \\ b & d \end{pmatrix}\begin{pmatrix} p & r \\ q & s \end{pmatrix}$
with $\begin{pmatrix} p & r \\ q & s \end{pmatrix}$ the matrix of the change of basis. Thus if we always
use positively ordered bases, the lattice L gives rise to a proper equivalence class of
quadratic forms.

The norm form N(αx + βy) associated to a lattice L = L(α, β) in R_Δ might not
have discriminant Δ. For example, if we replace L by nL = L(nα, nβ) this multiplies
the norm form by n² and so the discriminant is multiplied by n⁴. We can always
rescale a form to have any discriminant we want just by multiplying it by a suitable
positive constant, but this may lead to forms with noninteger coefficients. To illustrate
this potential difficulty, suppose we take Δ = −4 so R_Δ = Z[i]. The lattice L(2, i) in
Z[i] yields the form N(2x + iy) = 4x² + y² of discriminant −16, but to rescale this
to have discriminant −4 we would have to take the form 2x² + ½y².

Fortunately this problem does not occur if we consider only lattices that are ideals.
By Proposition 8.16 each ideal L in R_Δ is equal to a multiple nL_Q = L(na, n(b + √Δ)/2)
for some form Q(x, y) = ax² + bxy + cy² of discriminant Δ with a > 0. We have
aQ(x, y) = N(ax + ((b + √Δ)/2)y), hence n²aQ(x, y) = N(nax + n((b + √Δ)/2)y) which is the
norm form for L in the basis na, n(b + √Δ)/2. This basis is positively ordered since a > 0.
By dividing this norm form for L by n²a we get a form with integer coefficients and
discriminant Δ, namely the form Q. If we change the basis na, n(b + √Δ)/2 for L to
some other positively ordered basis α, β it is still true that the form N(αx + βy)/(n²a)
has integer coefficients and discriminant Δ since this just changes Q to a properly
equivalent form.

Note that the scaling factor n²a is the norm N(L) of the ideal L = nL(a, (b + √Δ)/2).
Thus we have shown:


Proposition 8.19. For an ideal L in R_Δ with positively ordered basis α, β the form
N(αx + βy)/N(L) has integer coefficients and discriminant Δ. □
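
The rescaled norm form is simple to compute, and doing so makes the contrast between
ideals and general lattices visible. The Python sketch below is an illustration under
our own assumptions: it treats only the case Δ = 4D with elements x + y√D stored as
pairs (x, y), the function name norm_form is hypothetical, and the basis passed in is
assumed to be positively ordered. Exact fractions are used so that noninteger
coefficients show up as such.

```python
from fractions import Fraction


def norm_form(alpha, beta, D):
    """Coefficients (A, B, C) of N(alpha x + beta y)/N(L) for L = L(alpha, beta).

    alpha = a + b*sqrt(D) and beta = c + d*sqrt(D) are given as pairs.
    """
    (a, b), (c, d) = alpha, beta
    lattice_norm = abs(a * d - b * c)
    coeff_xx = a * a - D * b * b        # alpha times its conjugate
    coeff_xy = 2 * (a * c - D * b * d)  # alpha*conj(beta) + conj(alpha)*beta
    coeff_yy = c * c - D * d * d        # beta times its conjugate
    return tuple(Fraction(t, lattice_norm) for t in (coeff_xx, coeff_xy, coeff_yy))
```

For the ideal L(5, 2 + √−6) in Z[√−6] this gives (5, 4, 2), the form 5x² + 4xy + 2y² of
discriminant −24, and the multiple 3L gives the same form. For the lattice L(2, i) in
Z[i], which is not an ideal, it gives (2, 0, 1/2).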


For an ideal L with positively ordered basis α, β the form N(αx + βy)/N(L) will
be denoted by Q_L, although a more precise notation might include α and β since the
form depends on the choice of basis.

Different ideals L in R_Δ can give properly equivalent forms Q_L. Obviously a
rescaling nL of L gives the same form Q_{nL} = Q_L. More generally, suppose we
multiply all elements of an ideal L = L(α, β) by a fixed nonzero element γ of R_Δ to get a new
ideal γL = L(γα, γβ). Taking norms, we have N(γαx + γβy) = N(γ)N(αx + βy),
so if N(γ) > 0 the new form N(γαx + γβy) is just a rescaling of N(αx + βy), with
rescaling factor N(γ). Thus after rescaling to get discriminant Δ we have Q_{γL} = Q_L
when N(γ) > 0. Specifically, if we use the formula N(γL) = |N(γ)|N(L) then when
N(γ) > 0 we have:

$$\frac{N(\gamma\alpha x + \gamma\beta y)}{N(\gamma L)} = \frac{N(\gamma)N(\alpha x + \beta y)}{N(\gamma)N(L)} = \frac{N(\alpha x + \beta y)}{N(L)}$$

As a technical point, we should check that γα, γβ is positively ordered if α, β is
positively ordered. When Δ < 0 this is automatic since multiplication by γ just
rotates and rescales the plane. When Δ > 0 we can argue as follows. As we saw in the
proof of Proposition 8.14, multiplication in Q(√Δ) by a fixed element γ = p + q√Δ
is a linear transformation with matrix $\begin{pmatrix} p & q\Delta \\ q & p \end{pmatrix}$.
This has determinant p² − Δq² =
N(p + q√Δ), so if N(γ) > 0 the matrix corresponding to the basis γα, γβ has positive
determinant exactly when the matrix corresponding to α, β has positive determinant.

When Δ < 0 we always have N(γ) > 0, but when Δ > 0 it is possible to have
N(γ) < 0. In this case the form N(γαx + γβy) is the negative of a rescaling of
N(αx + βy) and the basis γα, γβ is oppositely ordered from α, β, so Q_{γL} is the
negative of the mirror image form of Q_L.

Since the forms Q_L and Q_{γL} are properly equivalent when N(γ) > 0, we would
like to regard the ideals L and γL as being equivalent. Any reasonable notion of
equivalence should have the property that two things equivalent to the same thing are
equivalent to each other, but this does not seem to hold for the notion of equivalence
that we just considered since if two ideals L and L′ are equivalent to the same ideal
γL = γ′L′ for some γ and γ′ in R_Δ, then it does not follow that L′ = δL or L = δL′
for some δ in R_Δ since the quotients γ/γ′ and γ′/γ might not lie in R_Δ.

To avoid this difficulty we define two ideals L and L′ in R_Δ to be equivalent,
written L ~ L′, if γL = γ′L′ for some nonzero elements γ, γ′ in R_Δ. If in addition
N(γ) > 0 and N(γ′) > 0 then we say L and L′ are strictly equivalent and write
L ≈ L′. In particular we have L ~ γL for each nonzero γ in R_Δ since if we let L′ = γL
and γ′ = 1 then the equation γL = γ′L′ becomes just γL = L′. Similarly, L ≈ γL for
every γ with N(γ) > 0.

Conversely, a general equivalence L ~ L′ can be realized as a pair of equivalences
of the special type originally considered, namely L ~ γL = γ′L′ ~ L′, and likewise for
strict equivalences. Thus we have not really changed the underlying idea by defining
the two kinds of equivalence ~ and ≈ as we did. What we have gained is the property
that two things equivalent to the same thing are equivalent to each other, which can
be expressed as the assertion that if L ~ L′ and L′ ~ L″ then L ~ L″. This holds since
if γL = γ′L′ and δ′L′ = δ″L″ then δ′γL = δ′γ′L′ = γ′δ″L″ so L ~ L″. This reasoning
also works with ≈ in place of ~ by adding the condition that all of γ, γ′, δ′, δ″ have
positive norm, hence all their products have positive norm as well.

For negative discriminants there is no difference between equivalence and strict
equivalence of ideals since norms of nonzero elements of R_Δ are always positive,
but for positive discriminants there can be a difference. This happens for example
when Δ = 12. Here the two forms x² − 3y² and 3x² − y² correspond to the ideals
L(1, √3) = (1) and L(3, √3) = (√3) in R_Δ = Z[√3]. These two ideals are equivalent
since (√3) = γ(1) for γ = √3. However, N(√3) = −3 so this does not show the
ideals are strictly equivalent. In fact they are not strictly equivalent since if they were,
then the forms x² − 3y² and 3x² − y² would be properly equivalent, but this is not
the case as one can see from their topographs or from the fact that the character x; 
takes the value +1 on the first form and —1 on the second form. 

This example can be contrasted with the case Δ = 8 with R_Δ = Z[√2]. Here
the two forms x² − 2y² and 2x² − y² correspond to the ideals L(1, √2) = (1) and
L(2, √2) = (√2). Again the two ideals are equivalent since (√2) = γ(1) for γ = √2,
with N(√2) = −2. There is a unit ε = 1 + √2 of norm −1 so we have (√2) = (ε√2) =
(2 + √2) = γ(1) for γ = 2 + √2 with N(2 + √2) > 0 and hence the ideals (1) and
(√2) are strictly equivalent. In fact the forms x² − 2y² and 2x² − y² are properly
equivalent as one can see from their topographs.

In the previous example with Δ = 12 there is no unit of norm −1 since −1 is
represented by the form 3x² − y² but not by the norm form x² − 3y². As we will
now see, the distinction between equivalence and strict equivalence of ideals is entirely
accounted for by the existence or nonexistence of units of norm −1.


Proposition 8.20. For positive discriminants Δ the relations of equivalence and
strict equivalence of ideals in R_Δ are the same if and only if there is a unit in R_Δ
of norm −1.


Note that it suffices to consider only the fundamental unit since if this has norm 
+1 then all units have norm +1. 


Proof: Suppose there is a unit ε in R_Δ with N(ε) = −1 and suppose two ideals L and
M are equivalent via an equality αL = βM. We have αL = εαL so we can arrange that
N(α) > 0 by replacing α with εα if necessary. In the same way we can arrange that
N(β) > 0. Thus L and M are strictly equivalent.

For the converse, suppose equivalence is the same as strict equivalence. Since we
assume Δ > 0, there exist elements α in R_Δ with N(α) < 0. The ideals R_Δ and αR_Δ
are equivalent so by hypothesis they are strictly equivalent. This means βR_Δ = γαR_Δ
for some elements β and γ in R_Δ of positive norm. Since β is in βR_Δ = γαR_Δ we
have β = γαδ for some δ in R_Δ. Also γα is in γαR_Δ = βR_Δ so γα = βε for some
ε in R_Δ. Thus β = γαδ = βεδ and hence 1 = εδ since β ≠ 0. Thus δ and ε are
units. The equation γα = βε implies that N(ε) < 0 since N(γ) > 0, N(α) < 0, and
N(β) > 0. Since ε is a unit, its norm is then −1. □
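
Since only the fundamental unit matters, the criterion in the proposition can be tested
by a small search. The Python sketch below is a hedged illustration restricted to the
case Δ = 4D with R_Δ = Z[√D]; the function name is ours, and the search relies on
the fact that the unit x + y√D > 1 with the smallest y ≥ 1 is the fundamental unit.

```python
from math import isqrt


def fundamental_unit(D):
    """Return (x, y, norm) for the smallest unit x + y*sqrt(D) > 1 in Z[sqrt(D)].

    D must be a positive integer that is not a square.
    """
    if D <= 0 or isqrt(D) ** 2 == D:
        raise ValueError("D must be positive and not a square")
    y = 1
    while True:
        # Look for an integer x with x^2 - D*y^2 equal to -1 or +1.
        for norm in (-1, 1):
            x_squared = D * y * y + norm
            x = isqrt(x_squared)
            if x * x == x_squared:
                return x, y, norm
        y += 1
```

For D = 2 (Δ = 8) this returns (1, 1, -1), the unit 1 + √2 of norm −1, while for D = 3
(Δ = 12) it returns (2, 1, 1), so every unit of Z[√3] has norm +1.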


Now we come to the main result in this section: 


Theorem 8.21. There is a one-to-one correspondence between the set of strict
equivalence classes of ideals in R_Δ and the set of proper equivalence classes of quadratic
forms of discriminant Δ. Under this correspondence an ideal L with a positively
ordered basis α, β corresponds to the form Q_L(x, y) = N(αx + βy)/N(L), and a form
Q(x, y) = ax² + bxy + cy² with a > 0 corresponds to the ideal L_Q = L(a, (b + √Δ)/2).
(When Δ < 0 we are considering only forms with positive values, as usual.)


For example, when all forms of discriminant Δ are equivalent and hence properly
equivalent, the theorem says that all ideals are strictly equivalent. When Δ < 0 this is
saying that all ideals have the same shape, or equivalently that all ideals are principal
ideals. The negative discriminants for which this happens are −3, −4, −7, −8, −11,
−19, −43, −67, and −163. For the first five of these we already saw that all ideals
have the same shape using a geometric argument, but that argument does not apply
in the last four cases.

The condition a > 0 in the theorem plays a role only when Δ > 0, but its role is
sometimes important. For example, the principal form x² + bxy + cy² corresponds
to the ideal L(1, (b + √Δ)/2) which equals R_Δ since it contains 1, but without the
condition a > 0 the negative of the principal form would correspond to
L(−1, (−b + √Δ)/2) which also equals R_Δ since it contains −1. However, for some
values of Δ such as Δ = 12 the principal form is not equivalent to its negative.
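
The passage from an ideal L(a, (b + √Δ)/2) back to its form can be checked by direct
computation. The following Python sketch is an illustration only and is not part of the
text: it represents an element p + q√Δ by a pair of rationals (p, q), and the names
`norm` and `form_from_ideal` are hypothetical helpers chosen here. It evaluates
N(ax + ((b + √Δ)/2)y)/N(L), taking N(L) = a as in the theorem.

```python
from fractions import Fraction

def norm(p, q, disc):
    """Norm of p + q*sqrt(disc), namely p^2 - disc*q^2."""
    return p * p - disc * q * q

def form_from_ideal(a, b, disc):
    """Return Q_L as a function of (x, y) for L = L(a, (b + sqrt(disc))/2).

    Assumes a > 0 and uses N(L) = a, as stated for these ideals.
    """
    half = Fraction(1, 2)

    def Q(x, y):
        # a*x + ((b + sqrt(disc))/2)*y = (a*x + b*y/2) + (y/2)*sqrt(disc)
        p = a * x + b * half * y
        q = half * y
        return norm(p, q, disc) / a

    return Q

# Example: the form 3x^2 + 2xy - 11y^2 of discriminant 136.
Q = form_from_ideal(3, 2, 136)
print(all(Q(x, y) == 3*x*x + 2*x*y - 11*y*y
          for x in range(-4, 5) for y in range(-4, 5)))
```

Exact rational arithmetic is used so that the comparison with ax² + bxy + cy² is an
equality of numbers rather than an approximate floating-point check.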


Proof: Let Φ be the function from the set of strict equivalence classes of ideals to
the set of proper equivalence classes of forms induced by sending an ideal L with a
positively ordered basis α, β to the form Q_L(x,y) = N(αx + βy)/N(L). The function
Φ is well defined since we have seen that changing one positively ordered basis for L
to another changes the associated form to a properly equivalent form, and replacing L
with basis α, β by γL with basis γα, γβ leaves the form unchanged when N(γ) > 0.

To see that Φ is onto, note first that in each proper equivalence class of forms
there are forms Q(x,y) = ax² + bxy + cy² with a > 0 since the topograph of an
elliptic or hyperbolic form always contains some positive numbers, so we can choose
Q so that Q(1,0) > 0. Then Q = Q_L for the ideal L = L_Q = L(a, (b + √Δ)/2) since
Q_L = N(ax + ((b + √Δ)/2)y)/N(L) = ax² + bxy + cy², using the fact that N(L) = a.

To show that Φ is one-to-one, suppose we have two ideals L and L′ with positively
ordered bases α, β and α′, β′ such that the associated forms Q_L and Q_L′ with respect
to these bases are properly equivalent. We can assume the basis α, β is chosen so that
Q_L(1,0) > 0. Since Q_L and Q_L′ are properly equivalent we can then choose α′, β′ so
that we have actual equality Q_L(x,y) = Q_L′(x,y) for all x and y. We have N(α) =
Q_L(1,0)·N(L) > 0 and N(α′) = Q_L′(1,0)·N(L′) > 0 since Q_L(1,0) = Q_L′(1,0) > 0.

The forms N(αx + βy) and N(α′x + β′y) are rescalings of each other since
they rescale to the same form Q_L(x,y) = Q_L′(x,y). Let γ = β/α and γ′ = β′/α′,
elements of Q(√Δ). We have N(αx + βy) = N(α)N(x + γy) and N(α′x + β′y) =
N(α′)N(x + γ′y) so the two forms N(x + γy) = N(αx + βy)/N(α) and N(x + γ′y) =
N(α′x + β′y)/N(α′) are also rescalings of each other. Note that these two forms
have rational coefficients, not necessarily integers. Since the forms N(x + γy) and
N(x + γ′y) are rescalings of each other and take the same value at (x,y) = (1,0),
namely N(1) = 1, they must actually be equal.

Next we show that in fact γ = γ′. Let γ = r + s√Δ and γ′ = r′ + s′√Δ with
r, s, r′, s′ in Q. We have N(x + γy) = N(x + γ′y) for all integers x and y so in
particular N(γ) = N(γ′) which means r² − s²Δ = r′² − s′²Δ. Also N(1 + γ) = N(1 + γ′)
so the difference N(1 + γ) − N(γ) = ((r + 1)² − s²Δ) − (r² − s²Δ) = 2r + 1 equals
the difference N(1 + γ′) − N(γ′) = 2r′ + 1 and hence r = r′. From the earlier
equation r² − s²Δ = r′² − s′²Δ we then get s = ±s′. The bases 1, γ and 1, γ′ are
positively ordered since this was true for α, β and α′, β′ and multiplication by α and
α′ preserves orientation of the plane since N(α) > 0 and N(α′) > 0. Since both 1, γ
and 1, γ′ are positively ordered we must have s > 0 and s′ > 0 so s = s′. Thus
γ = γ′ as claimed.

The lattice L(1,γ) may not lie in R_Δ since γ is only an element of Q(√Δ), but we
can rescale L(1,γ) to a lattice nL(1,γ) = L(n,nγ) in R_Δ by multiplying by a positive
integer n such that nγ is in R_Δ. Using the symbol ∼ to denote strict equivalence of
ideals, we then have:


L = L(α,β) ∼ nL(α,β) = L(nα,nβ) = L(nα,nαγ) = αL(n,nγ) ∼ L(n,nγ)


Similarly, L′ ∼ L(n′,n′γ′) for some positive integer n′, but we can choose n′ = n
since γ = γ′. Thus both L and L′ are strictly equivalent to L(n,nγ) so they are
strictly equivalent to each other. This finishes the proof that Φ is one-to-one. □


To illustrate the theorem consider the case Δ = 60 where there are four proper
equivalence classes of forms, given by x² − 15y², 15x² − y², 3x² − 5y² and 5x² − 3y².
The corresponding ideals in R_Δ = Z[√15] are (1, √15) = (1), (15, √15) = (√15),
(3, √15), and (5, √15). According to the theorem no two of these ideals are strictly
equivalent, although the first two are equivalent since (√15) = √15(1) and the second
two are equivalent since √15(3, √15) = 3(5, √15). This corresponds to the fact that
the two forms in each pair are negative mirror images of each other, although all four
forms have mirror symmetry so taking mirror images makes no difference.

For another example take Δ = 136 with class number 4 realized by the forms
x² − 34y², 34x² − y², and 3x² ± 2xy − 11y² as we saw in an example in Section 7.4 that
displayed an interesting combination of symmetry and skew symmetry properties. In
R_Δ = Z[√34] the four forms correspond to the ideals (1, √34) = (1), (34, √34) =
(√34), and (3, 1 ± √34). The first two ideals are obviously equivalent. For the second
two, if we multiply (3, 1 + √34) by some γ with N(γ) < 0, for example γ = √34, we
get an ideal corresponding to the negative mirror image of the form 3x² + 2xy − 11y².
The topograph of this form has rotational skew symmetries but no mirror symmetries,
so its negative mirror image is 3x² − 2xy − 11y². Thus √34(3, 1 + √34) must be
strictly equivalent to (3, 1 − √34), so (3, 1 + √34) and (3, 1 − √34) are equivalent but
not strictly equivalent. This is true also for the other two ideals (1) and (√34) but for
a different reason since the forms x² − 34y² and 34x² − y² have mirror symmetry
but no skew symmetries rather than vice versa.


The correspondence between forms and ideals includes nonprimitive forms as
well as primitive forms, but the ideals corresponding to primitive and nonprimitive
forms behave somewhat differently. Let us illustrate this by the example of
discriminant Δ = −12 where there are two equivalence classes of forms, given by the
primitive form x² + 3y² and the nonprimitive form 2x² + 2xy + 2y².


        x² + 3y² ↔ L(1, √−3)              2x² + 2xy + 2y² ↔ L(2, 1 + √−3)

[Figure: the two lattices L(1, √−3) and L(2, 1 + √−3) drawn in the plane.]


The ideal for 2x² + 2xy + 2y² is a lattice of equilateral triangles, and this lattice has
the special property that it is taken to itself not just by multiplication by elements
of R_Δ = Z[√−3] but also by the 60 degree rotation given by multiplication by the
element ω = (1 + √−3)/2 in the larger ring Z[ω] which is R_Δ for Δ = −3. Hence the
lattice L(2, 1 + √−3) is taken to itself by all elements of Z[ω] and so this lattice is an
ideal in Z[ω], not just in the original ring Z[√−3].

More generally, suppose we start with a form Q = ax² + bxy + cy² of
discriminant Δ and then consider the nonprimitive form kQ = kax² + kbxy + kcy²
of discriminant k²Δ for some integer k > 1. The associated ideal L_kQ is then
L(ka, (kb + k√Δ)/2) = kL(a, (b + √Δ)/2) = kL_Q. This is an ideal not just in R_{k²Δ}
but also in the larger ring R_Δ since it is k times an ideal in R_Δ, namely k times L_Q.

Let us say that an element α in Q(√Δ) stabilizes an ideal L in R_Δ if αL is
contained in L, and let us call the set of all such elements α the stabilizer of L. The
stabilizer of L contains R_Δ and is a ring itself since if two elements α and β in Q(√Δ)
stabilize L then so do α ± β and αβ. If the stabilizer of L is exactly R_Δ then we will
say that L is stable.

For example, principal ideals (γ) are stable since if α(γ) is contained in (γ) then
in particular αγ is in (γ) and so we have αγ = βγ for some β in R_Δ. Canceling γ,
we then have α = β so α is an element of R_Δ.


Proposition 8.22. A form Q of discriminant Δ is primitive if and only if the
corresponding ideal L_Q in R_Δ is stable.


Proof: We observed above that a nonprimitive form Q of discriminant Δ gives an
ideal L_Q with stabilizer larger than R_Δ. For the converse we wish to show that if
Q = ax² + bxy + cy² is a primitive form of discriminant Δ then L_Q is not an ideal
in any larger ring than R_Δ in Q(√Δ). Let us write L_Q as L(a,τ) for τ = (b + √Δ)/2.
Note that R_Δ = Z[τ] since b has the same parity as Δ. Also Q(√Δ) = Q(τ).

Suppose we have an element α = r + sτ in Q(τ) such that αL(a,τ) is contained
in L(a,τ). Here r and s are rational numbers. Our goal is to show that Q being
primitive forces r and s to be integers. This will say that α is in Z[τ] = R_Δ, and
hence that R_Δ is the stabilizer of L(a,τ).

Since αL(a,τ) is contained in L(a,τ), both αa and ατ are in L(a,τ). We have
αa = ra + saτ, and for this to be in L(a,τ), which consists of the linear combinations
xa + yτ with x and y integers, means that r is an integer and sa is an integer. It
remains to show that ατ being in L(a,τ) implies that s is an integer.

To do this we first compute ατ using the fact that τ is a root of the equation
x² − bx + ac = 0 so τ² = bτ − ac. Then we have:


ατ = rτ + sτ² = rτ + s(bτ − ac) = −sac + (r + sb)τ


For this to be in L(a,τ) means that sc and r + sb are integers. We already know that
r is an integer, so r + sb being an integer is equivalent to sb being an integer. Thus
we know that all three of sa, sb, and sc are integers. Let us write s as a fraction m/n
in lowest terms. Then sa = ma/n is an integer, so n must divide a. Similarly sb and
sc being integers implies that n divides b and c. But 1 is the only common divisor
of a, b and c since the form ax² + bxy + cy² is primitive, so n = 1. Thus s is an
integer and we are done. □
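
The integrality conditions appearing in this proof are easy to test by machine. The
Python sketch below is only an illustration and not part of the text; the function
names `stabilizes` and `is_primitive` are hypothetical. It uses the two computations
from the proof, αa = ra + (sa)τ and ατ = −sac + (r + sb)τ, so that α = r + sτ
stabilizes L(a,τ) exactly when r, sa, sc, and r + sb are all integers.

```python
from fractions import Fraction
from math import gcd

def stabilizes(a, b, c, r, s):
    """True if alpha = r + s*tau takes L(a, tau) into itself.

    Here tau = (b + sqrt(disc))/2 for the form ax^2 + bxy + cy^2.
    """
    r, s = Fraction(r), Fraction(s)
    # alpha*a   = r*a + (s*a)*tau         -> coordinates (r, s*a) in the basis a, tau
    # alpha*tau = -s*a*c + (r + s*b)*tau  -> coordinates (-s*c, r + s*b)
    coords = [r, s * a, -s * c, r + s * b]
    return all(x.denominator == 1 for x in coords)

def is_primitive(a, b, c):
    """True if a, b, c have no common divisor larger than 1."""
    return gcd(gcd(a, b), c) == 1

# The nonprimitive form 2x^2 + 2xy + 2y^2: here tau = 1 + sqrt(-3), and
# alpha = tau/2 is the 60 degree rotation, which lies outside Z[tau].
print(is_primitive(2, 2, 2), stabilizes(2, 2, 2, 0, Fraction(1, 2)))

# The primitive form x^2 + 3y^2: the same alpha = tau/2 does not stabilize.
print(is_primitive(1, 0, 3), stabilizes(1, 0, 3, 0, Fraction(1, 2)))
```

For a primitive form one can also loop over fractions s = m/n with small denominators
n > 1 and observe that none of them pass the test, in line with the proposition.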


A Digression on Shapes of Lattices 


Let us go into a little more detail about the shapes of lattices in the plane. This 
will not be used in the rest of the chapter, although it does provide some enlightening 
context. Lattice shapes are mostly of interest for negative discriminants, but for the
following discussion we will consider all possible lattices in the plane, without regard
to whether they lie in some ring R_Δ or not.

Recall that we say two lattices have the same shape if one can be transformed 
into the other by rotation and rescaling of the plane. With this definition of shape 
one can ask whether it is possible to characterize exactly all the different shapes of 
lattices. We will give such a characterization and then see how this relates to forms 
of negative discriminant. 

First let us get a global picture of all the possible shapes of lattices in the plane. 
Given a lattice L, choose a point in L that is closest to the origin, other than the 
origin itself. We can rotate L about the origin until this point lies on the positive 
x-axis, and then we can rescale L until this point is at distance 1 from the origin, 
so it is the point (1,0), or in other words the complex number 1. Now choose a
point α in L closest to the origin among all points of L above the x-axis. Thus α
lies on or outside the unit circle x² + y² = 1. Also, α must lie in the vertical
strip consisting of points x + yi with −1/2 ≤ x ≤ 1/2, otherwise there would be
another point of L inside this strip that had the same y-coordinate as α and
was closer to the origin than α. This is because all points of L lie in horizontal
rows of points of distance 1 apart. The lattice L(1,α) is contained in L and in
fact must equal L by the way that we chose α. (There are no other points of L above
the x-axis and inside the circle x² + y² = r² passing through α.)

[Figure: the point α in the vertical strip −1/2 ≤ x ≤ 1/2, on or above the unit circle,
with the x-axis marked at −1, −1/2, 0, 1/2, 1.]

Let R be the region of the plane consisting of the points α as above, that is, all
α = x + yi with x² + y² ≥ 1, −1/2 ≤ x ≤ 1/2, and y > 0.


Proposition 8.23. The lattices L(1,α) with α in R realize all lattice shapes, and of
these lattices the only ones having the same shape are the pairs L(1, 1/2 + yi) and
L(1, −1/2 + yi) and the pairs L(1, x + yi) and L(1, −x + yi) with x² + y² = 1.


Note that these pairs all lie on the boundary of R, either on the vertical edges 
or on the circular arc forming the lower edge of R. The two points of each pair are 
mirror reflections of each other across the y-axis. 


Proof: We have already seen that all lattices have the shape of a lattice L(1,α) for
some α in R, and it remains to see when two of these lattices L(1,α) have the same
shape. A more basic question is when two of the lattices L(1,α) and L(1,β) with
α and β in R are the same lattice. If this happens, the y-coordinates of α and β
must be the same since this is the coordinate of points in the first row of the lattice
above the x-axis. The x-coordinates of α and β must then differ by an integer if
L(1,α) = L(1,β), so if α and β are distinct points in R the only possibility is that
α and β are points 1/2 + yi and −1/2 + yi on the two vertical edges of R.

For L(1,α) and L(1,β) to have the same shape means that there is a rotation
and rescaling taking one to the other. However, there can be no rescaling since the
smallest distance from nonzero points in these two lattices to the origin is 1 in both
cases. To see what sorts of rotations are possible, consider the subsets C_α of L(1,α)
and C_β of L(1,β) consisting of the lattice points at distance 1 from the origin. If
there is a rotation taking L(1,α) to L(1,β) then this rotation carries C_α onto C_β. In
particular, C_α and C_β must have the same number of points. The points 1 and −1
always belong to C_α and C_β. If these are the only points in C_α and C_β then the only
rotations taking C_α to C_β are rotations by 0 and 180 degrees, but these do not affect
the lattices so we must have L(1,α) = L(1,β) in this case. If C_α and C_β have more
than two points then C_α will include ±α and C_β will include ±β. If C_α = {±1, ±α}
and C_β = {±1, ±β} then the only way for C_α to be a rotation of C_β is for the two
arcs in the upper half of the unit circle joining α to 1 and to −1 to have the same
lengths as the two arcs from β to 1 and −1, after possibly interchanging the two arcs
for α or β as in the figure. This implies that β is equal to either α or the reflection
of α across the y-axis. Thus L(1,α) and L(1,β) are L(1, x + yi) and L(1, −x + yi) for
some x and y with x² + y² = 1. The remaining possibility is that C_α and C_β contain
more than four points, but this only happens when they are the vertices of regular
hexagons inscribed in the unit circle since the points of C_α must be of distance at
least 1 apart, and likewise for C_β. In this hexagonal case we have L(1,α) = L(1,β),
finishing the proof. □

[Figure: two unit circles, one showing the points ±1, ±α and the other the points
±1, ±β.]
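
For readers who like to experiment, here is a small Python sketch that moves an
arbitrary lattice basis to a point of the region R. It is not the procedure used in the
text, which picks closest points directly; instead it uses the standard reduction that
alternates the moves z → z − k and z → −1/z, both of which replace L(1, z) by a
lattice of the same shape. The function name `shape_point` and the tolerance are
assumptions made for this illustration, and floating-point arithmetic means that points
very near the boundary of R may land on either side of it.

```python
import math

def shape_point(u, v, max_steps=1000):
    """Return a point z of R with L(1, z) of the same shape as L(u, v).

    u and v are complex numbers not on a common line through the origin.
    """
    z = v / u                      # rotate and rescale so that u becomes 1
    if z.imag < 0:
        z = -z                     # L(1, z) = L(1, -z)
    for _ in range(max_steps):
        # Shift by an integer to bring the real part into [-1/2, 1/2].
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1 - 1e-12:
            return z
        z = -1 / z                 # L(1, -1/z) = (1/z) L(1, z)
    raise RuntimeError("no reduced point found within max_steps")

# The lattice L(12, 4 + sqrt(-164)), given by a non-reduced basis.
z = shape_point(12, complex(40, math.sqrt(164)))
print(z.real, abs(z) ** 2)
```

In this example the resulting point has x-coordinate close to 1/3 and norm close to
5/4, up to rounding error.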


Let us see now how the lattices L_Q = L(a, (b + √Δ)/2) associated to elliptic forms
Q = ax² + bxy + cy² fit into this picture. Here a and c are positive since we only
consider positive elliptic forms. For the two basis elements of L_Q we have N(a) = a²
and N((b + √Δ)/2) = ((b + √Δ)/2)·((b − √Δ)/2) = (b² − Δ)/4 = ac. If we assume that Q
is reduced, so 0 ≤ b ≤ a ≤ c, then N(a) ≤ N((b + √Δ)/2). Also the x-coordinate of
(b + √Δ)/2, which is b/2, is at most a/2. From these facts we can deduce that a is the
closest point in L_Q to the origin. Then when we rescale L_Q by shrinking by a factor
of a we get the lattice L(1,α) with α = (b + √Δ)/(2a), with α lying in the right half of
the region R since N(α) ≥ 1 and 0 ≤ b/(2a) ≤ 1/2. Conversely, if (b + √Δ)/(2a) is in
the right half of R then we have 0 ≤ b ≤ a ≤ c. Thus Q is reduced exactly when the
rescaled L_Q is L(1,α) with α in the right half of R.

If we replace Q by nQ then L_Q is replaced by L(na, (nb + n√Δ)/2) = nL_Q so this is
just a rescaling of L_Q with the same shape and hence corresponding to the same point
α in R. Apart from rescaling Q in this way, different reduced forms give different
points α in R since the x-coordinate b/(2a) of α determines the ratio b/a and the
norm of α gives the ratio c/a.

Any point α in the right half of R with rational x-coordinate and rational norm
arises in this way from a reduced elliptic form Q. For example for an x-coordinate of
1/3 and a norm of 5/4 we have b/(2a) = 1/3 and c/a = 5/4. Rewriting these two
fractions with a common denominator, we get 4/12 and 15/12. Then after writing 4/12
as 8/24 we can choose a = 12, b = 8, and c = 15, producing the form 12x² + 8xy + 15y².
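
The bookkeeping with common denominators in this example can be automated. The
Python sketch below is an illustration under stated assumptions rather than anything
from the text: the function name `reduced_form_from_point` is hypothetical, and it
takes a to be the least common multiple of the denominators of b/a and c/a, which
yields integers a, b, c with no common factor. It relies on `math.lcm`, available in
Python 3.9 and later.

```python
from fractions import Fraction
from math import lcm

def reduced_form_from_point(x, n):
    """Return (a, b, c) with b/(2a) = x and c/a = n.

    x is the rational x-coordinate and n the rational norm of a point in
    the right half of the region R, so 0 <= x <= 1/2 and n >= 1.
    """
    x, n = Fraction(x), Fraction(n)
    if not (0 <= x <= Fraction(1, 2) and n >= 1):
        raise ValueError("point is not in the right half of R")
    ratio_b = 2 * x                                # this is b/a
    a = lcm(ratio_b.denominator, n.denominator)    # smallest a making b and c integers
    b = int(ratio_b * a)
    c = int(n * a)
    return a, b, c

a, b, c = reduced_form_from_point(Fraction(1, 3), Fraction(5, 4))
print(a, b, c)                                     # coefficients of the form
print(Fraction(b, 2 * a), Fraction(c, a))          # recovers the x-coordinate and norm
```

Multiplying the result by any positive integer gives another form with the same point
in R, which is why only the ratios b/a and c/a matter here.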

Points in the left half of the region R are realized by replacing b by −b, so
the form ax² + bxy + cy² is replaced by its mirror image form ax² − bxy + cy²
which is equivalent but not properly equivalent unless ax² + bxy + cy² has mirror
symmetry. The reduced forms with mirror symmetric topographs are those where one
of the inequalities 0 ≤ b ≤ a ≤ c becomes an equality. When b = 0 we have the forms
ax² + cy² corresponding to the lattices L(1, √Δ/(2a)) along the y-axis in R. These are
the rectangular lattices, with mirror symmetry across the y-axis. When b = a we have
the forms ax² + axy + cy² whose associated lattices L(1, (a + √Δ)/(2a)) =
L(1, 1/2 + √Δ/(2a)) lie along the right-hand edge of R. These lattices also have mirror
symmetry across the y-axis since they equal their mirror image lattices
L(1, −1/2 + √Δ/(2a)). Finally, if a = c we have forms ax² + bxy + ay² corresponding to
lattices L(1, (b + √Δ)/(2a)) with (b + √Δ)/(2a) having norm c/a = 1 and hence lying on
the arc of the unit circle forming the bottom border of R. These lattices also have
mirror symmetry since they form grids of rhombuses, the distances from both basis
elements 1 and (b + √Δ)/(2a) to the origin being equal.


Thus forms with mirror symmetric topographs give rise to mirror symmetric
lattices. The converse is also true since none of the lattices L(1,α) with α in the
interior of R but not on the y-axis have mirror symmetry. One can see this by noting
that for points α in the interior of R the only points in lattices L(1,α) of unit distance
apart lie on horizontal lines, so mirror symmetries of these lattices must take horizontal
lines to horizontal lines, which forces these symmetries to be reflections across either
horizontal or vertical lines. The only time such a reflection takes a lattice L(1,α) to
itself for some α in the interior of R is when α is on the y-axis, so the lattice is
rectangular.

It is interesting to compare the picture of the region R shown earlier with the 
figure in Section 5.5 showing the location of reduced elliptic forms in a triangle inside 
the Farey diagram. Here is this triangle, first as it appeared in Section 5.5 and then 
reflected across a 45 degree line: 


[Figure: the triangle with vertices [1,0,1], [1,1,1], and [0,0,1] and edges labeled
a = c, a = b, and b = 0, shown first as in Section 5.5 and then reflected across a
45 degree line.]


The three sides of the triangle are specified by the equations a = c, a = b, and b = 0,
so we see that the triangle corresponds exactly to the right half of the region R, with
the edge a = b corresponding to the right edge of R, the edge a = c to an arc of the
unit circle, and the edge b = 0 to the central vertical axis of R.


A Digression on Hyperbolic Motions 


For negative discriminants the relation of strict equivalence of ideals corresponds 
geometrically to rotation and rescaling of lattices. There is an analogous interpreta- 
tion for positive discriminants but it involves replacing rotations by somewhat more 
complicated motions of the plane involving hyperbolas, as we shall now see. 

What we want is a geometric description of the transformation T_γ of Q(√Δ)
defined by multiplying by a fixed nonzero element γ, so T_γ(α) = γα. For a positive
discriminant Δ we are regarding Q(√Δ) as a subset of the plane by giving an element
α = a + b√Δ the coordinates (x,y) = (a, b√Δ). The norm N(α) = a² − Δb² is
then equal to x² − y² and T_γ takes each hyperbola x² − y² = k to a hyperbola
x² − y² = N(γ)k since N(γα) = N(γ)N(α).

To picture linear transformations of the plane that take hyperbolas x² − y² = k
to hyperbolas x² − y² = k′ it will be convenient to change the coordinates x and y
to X = x + y and Y = x − y. This changes the hyperbolas x² − y² = k to the
hyperbolas XY = k whose asymptotes are the X-axis and the Y-axis, at a 45 degree
angle from the x-axis and the y-axis. Notice that since (x,y) = (a, b√Δ), the
coordinate X = x + y is just a + b√Δ, the real number α we started with, while
Y = x − y is a − b√Δ, its conjugate ᾱ.

[Figure: the hyperbolas XY = k with the X-axis and Y-axis as asymptotes, with arrows
showing the direction in which the hyperbolas slide along themselves.]

The transformation T_γ sends α to γα so T_γ multiplies the X-coordinate α by γ.
To see how T_γ acts on the Y-coordinate, observe that since the Y-coordinate of α
is ᾱ, the Y-coordinate of T_γ(α) = γα is its conjugate γ̄ᾱ, so the Y-coordinate of
T_γ(α) is γ̄ times the Y-coordinate of α. Thus T_γ multiplies the Y-coordinate by γ̄,
so we have the simple formula T_γ(X,Y) = (γX, γ̄Y).

A consequence of the formula T_γ(X,Y) = (γX, γ̄Y) is that T_γ takes the X-axis to
itself since the X-axis is the points (X,Y) with Y = 0. Similarly, T_γ takes the Y-axis
to itself, the points where X = 0. In general, a linear transformation that takes both
the X-axis and the Y-axis to themselves has the form T(X,Y) = (λX, μY) for real
constants λ and μ. In particular when μ = λ⁻¹ we have the transformation T(X,Y) =
(λX, λ⁻¹Y) taking each hyperbola XY = k to itself. When λ > 1 this transformation
stretches the X-coordinate by a factor of λ and shrinks the Y-coordinate by the same
factor. Thus each hyperbola XY = k slides along itself in the direction indicated by
the arrows in the figure above. When λ is between 0 and 1 the situation is reversed
and the Y-coordinate is stretched while the X-coordinate is shrunk.

When λ > 0 and μ > 0 we can rescale the transformation T(X,Y) = (λX, μY) to
(1/√(λμ))T(X,Y) = (√(λ/μ) X, √(μ/λ) Y) which is a transformation of the type considered
in the preceding paragraph, sliding each hyperbola along itself. Thus a transformation
T(X,Y) = (λX, μY) with λ and μ positive is a composition of a “hyperbola slide”
and a rescaling. This is analogous to compositions of rotations and rescalings in
the situation of negative discriminants. Allowing λ or μ to be negative then allows
reflections across the X-axis or the Y-axis as well. If both λ and μ are negative the
composition of these two reflections is a 180 degree rotation of the plane.
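
The splitting of T(X,Y) = (λX, μY) into a rescaling and a hyperbola slide can be made
concrete with a few lines of Python. This sketch is an illustration only, and the name
`slide_and_scale` is hypothetical. It returns the rescaling factor √(λμ) and the slide
factor √(λ/μ), so that T is the slide (X,Y) → (sX, Y/s) followed by multiplication of
both coordinates by the rescaling factor.

```python
import math

def slide_and_scale(lam, mu):
    """Split T(X, Y) = (lam*X, mu*Y), with lam and mu positive, into two factors.

    Returns (scale, slide) with lam = scale*slide and mu = scale/slide.
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("lam and mu must both be positive")
    scale = math.sqrt(lam * mu)    # uniform rescaling of the plane
    slide = math.sqrt(lam / mu)    # slide (X, Y) -> (slide*X, Y/slide)
    return scale, slide

scale, slide = slide_and_scale(8.0, 2.0)
# Recombining the two factors gives back lam and mu, up to rounding.
print(scale * slide, scale / slide)
```

When λμ = 1 the rescaling factor is 1 and the transformation is a pure hyperbola
slide.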


Now we specialize to the situation of a transformation T_γ of R_Δ given by
multiplication by an element γ in R_Δ with N(γ) > 0. The condition N(γ) > 0 implies
that T_γ preserves the orientation of the plane and also the sign of the norm, so it
takes each quadrant of the XY-plane (north, south, east, or west) either to itself or to
the opposite quadrant. In the former case T_γ is a composition of a hyperbola slide
and a rescaling, while in the latter case there is also a composition with a 180 degree
rotation of the plane, which is just T_γ for γ = −1. The sign of γ distinguishes these
two cases since if γ > 0 the transformation T_γ takes positive numbers to positive
numbers so the positive X-axis goes to itself, while if γ < 0 the positive X-axis goes
to the negative X-axis.


If γ is a unit with N(γ) = +1 then each hyperbola x² − y² = k is taken to itself
by T_γ. The two branches of the hyperbola are distinguished by the sign of X, so if γ
is positive then T_γ slides each branch along itself while if γ is negative this slide is
combined with a 180 degree rotation of the plane. If we choose γ to be the smallest
unit greater than 1 with N(γ) = +1 then the powers γⁿ for integers n lie along the
right-hand branch of the hyperbola x² − y² = 1, becoming farther and farther apart
as one moves away from the origin, and T_γ slides each one of these points along the
hyperbola to the next one, increasing the X-coordinate. The case Δ = 12 is shown
in the first figure below, with R_Δ = Z[√3]. The unit γ is 2 + √3, and the figure
shows the units ±γⁿ for |n| ≤ 2 positioned along the two branches of the hyperbola
x² − y² = 1, with γ² = 7 + 4√3 in the upper right corner of the figure.


[Figures: first, the units ±γⁿ for γ = 2 + √3 in Z[√3] on the hyperbola x² − y² = 1;
second, the units ±γⁿ for γ = 1 + √2 in Z[√2] on the hyperbolas x² − y² = ±1.]


For some discriminants there are units γ with N(γ) = −1 in addition to those with
N(γ) = +1. The transformation T_γ for the smallest γ > 1 of norm −1 is a
composition of a hyperbola slide and reflection across the X-axis. The powers γⁿ then
lie alternately on x² − y² = +1 and x² − y² = −1. This happens for example in
Z[√2] with γ = 1 + √2 as shown in the second figure above, where γ² = 3 + 2√2 and
γ³ = 7 + 5√2.

Each ideal in R_Δ is taken into itself by the transformations T_γ for γ in R_Δ, but
when γ is a unit each ideal is taken onto itself since the inverse transformation (T_γ)⁻¹
is just T_{γ⁻¹} which also takes the ideal to itself. Thus all ideals in R_Δ have “hyperbolic
symmetries”, the hyperbola-preserving transformations T_γ for units γ.
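
The powers of the units in the two examples above can be generated with exact integer
arithmetic. In the Python sketch below, which is an illustration and not part of the
text, an element a + b√d of Z[√d] is stored as the pair (a, b), with norm a² − db²;
the function names `mul`, `norm`, and `powers` are hypothetical.

```python
def mul(u, v, d):
    """Product of u = (a, b) and v = (c, e), standing for a + b*sqrt(d) and c + e*sqrt(d)."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)

def norm(u, d):
    """Norm a^2 - d*b^2 of a + b*sqrt(d)."""
    a, b = u
    return a * a - d * b * b

def powers(gamma, d, count):
    """Return the list [gamma, gamma^2, ..., gamma^count]."""
    out, cur = [], (1, 0)
    for _ in range(count):
        cur = mul(cur, gamma, d)
        out.append(cur)
    return out

# gamma = 2 + sqrt(3): every power has norm +1.
print([(p, norm(p, 3)) for p in powers((2, 1), 3, 3)])

# gamma = 1 + sqrt(2): the norms alternate between -1 and +1.
print([(p, norm(p, 2)) for p in powers((1, 1), 2, 3)])
```

Since the norm is multiplicative, the sign pattern of the norms of the powers is
determined by the norm of γ alone.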


Although we can describe how the ideals corresponding to properly equivalent 
quadratic forms of positive discriminant are related in geometric terms via hyperbola 
slides and rescaling, the result is somehow less satisfying than in the negative discrim- 
inant case. Hyperbola slides are not nearly as simple visually as rotations, making it 
harder to see at a glance whether two lattices are related by hyperbola slides and 
rescaling or not. This may be a reflection of the fact that hyperbolic forms do not 
have a canonical reduced form as elliptic forms do, making it a little more difficult to 
determine whether two hyperbolic forms are equivalent. 


Exercises 


1. For discriminant Δ = −23 draw the lattice L_Q for one form in each proper
equivalence class of forms. Prove that no two of these lattices have the same shape by
computing ratios of distances from the origin to nearby points in the lattice, with an
extra argument to deal with mirror image lattices that do not have the same shape.


2. Do the same things for Δ = −39.


3. (a) Given a lattice L in R_Δ and a nonzero element α in R_Δ, show that there is a
positive integer multiple nα that is in L.
(b) Show that the intersection of two lattices in R_Δ is a lattice.


4. (a) For a form ax² + bxy + cy² of discriminant Δ we have the associated ideal
L(a, (b + √Δ)/2) whose basis a, (b + √Δ)/2 determines a parallelogram P. When Δ < 0
show that P is a rhombus if and only if a = c.
(b) Give an example of a form ax² + bxy + ay² with Δ > 0 for which P is not a
rhombus.


5. Show that the norm N(L) of a lattice L in R_Δ can be computed in the following
way. Choose a basis α, β for L and let P_{α,β} be the parallelogram with vertices 0, α,
β, and α + β. Then N(L) is the total number of points of R_Δ in the interior of P_{α,β}
plus the number of points of R_Δ in the interiors of two adjacent edges of P_{α,β}, plus
an additional 1 for the vertex of P_{α,β} between these two edges.


6. Show that if L and L′ are lattices in R_Δ with L′ a subset of L then N(L) divides
N(L′).


7. Show that the number of lattices in R_Δ of norm n is equal to the divisor sum σ(n),
the sum of all the divisors of n including 1 and n itself.


8. Show that L(a, √n) is an ideal in Z[√n] if and only if a divides n.


9. (a) We know that if L is an ideal in R_Δ then so is γL for each nonzero γ in R_Δ.
Show the converse, that L is an ideal if γL is an ideal.
(b) Show that if γL is a principal ideal then so is L.


10. Find the four ideals in Z[√−14] of norm 15 and show that only two are principal
ideals, giving explicit generators for these two. (The relevant topographs are shown
in Section 6.1.)


11. (a) For Δ = 105 determine all the equivalence and strict equivalence classes of
ideals in R_Δ.
(b) Do the same for Δ = 145.


12. For discriminant Δ = −64 determine the stabilizers for all the ideals L_Q
associated to reduced forms Q, whether primitive or not.


13. Show that for each ideal L in R_Δ the stabilizer of L is the same as the stabilizer
of αL for each nonzero α in R_Δ.


14. Show that all ideals in R_Δ are stable if and only if Δ is a fundamental discriminant.


8.4 The Ideal Class Group 


An important feature of ideals is that there is a natural way to define a
multiplicative structure in the set of all ideals in R_Δ. Thus every pair of ideals L and M
in R_Δ has a product LM which is again an ideal in R_Δ. We will see that this leads
to a group structure on the set of strict equivalence classes of stable ideals, which, 
under the correspondence between ideals and forms, turns out to be the same as the 
group structure on the class group of forms studied in the previous chapter. If the 
procedure for defining the product of forms seemed perhaps a little complicated, the 
viewpoint of ideals provides an alternative that may seem more obvious and direct. 

In order to form the product LM of two ideals L and M in R_Δ one’s first guess
might be to let LM consist of all products αβ of elements α in L and β in M. This
does not always work, however, as we will see in an example later in this section.
The difficulty is that for two products α_1β_1 and α_2β_2 the sum α_1β_1 + α_2β_2 might
not be equal to a product αβ of an element of L with an element of M, as it would
have to be if the set of all products αβ was an ideal. This difficulty can be avoided
by defining LM to be the set of all sums α_1β_1 + ··· + α_nβ_n with each α_i in L and
each β_i in M. With this definition LM is obviously closed under addition as well as
subtraction. Also, multiplying such a sum Σ_i α_iβ_i by an element γ in R_Δ gives an
element of LM since γ Σ_i α_iβ_i = Σ_i (γα_i)β_i and the latter sum is in LM since each
product γα_i is in L because L is an ideal. To finish the verification that LM is an
ideal we need to check that it is a lattice since we defined ideals in R_Δ to be lattices
that are taken to themselves by multiplication by arbitrary elements of R_Δ. To check
that LM is a lattice we need to explain a few more things about lattices.

We defined a lattice in R_Δ to be a set L(α,β) of elements xα + yβ as x and y
range over all integers, where α and β are two elements of R_Δ that do not lie on the
same line through the origin. More generally we could define L(α_1, ···, α_n) to be the
set of all linear combinations x_1α_1 + ··· + x_nα_n with coefficients x_i in Z, where not
all the α_i’s lie on the same line through the origin (so in particular at least two α_i’s
are nonzero). It is not immediately obvious that L(α_1, ···, α_n) is a lattice, but this is
true and can be proved by a generalization of the procedure that converts an arbitrary
basis for a lattice into a reduced basis, as we will now describe.

There are three ways in which the set of generators α_i for L(α_1, ···, α_n) can be
modified without changing the set L(α_1, ···, α_n):


(1) Replace one generator α_i with α_i + kα_j, adding an integer k times some other
generator α_j to α_i.

(2) Replace some α_i by −α_i.

(3) Interchange two generators α_i and α_j, or more generally permute the α_i’s in
any way.

After a modification of type (1) each integer linear combination of the new generators
is also a linear combination of the old generators so the new L(α_1, ···, α_n) is a
subset of the old one, but the process can be reversed by another type (1) operation
subtracting kα_j from the new α_i so the new L(α_1, ···, α_n) also contains the old one
hence must equal it. For the operations (2) and (3) this is also true, more obviously.


Lemma 8.24. By applying a suitably chosen sequence of operations (1)–(3) to a
set of generators α_i for L(α_1, ···, α_n) it is always possible to produce a new set
of generators β_1, ···, β_n which are all zero except for β_1 and β_2. In particular
L(α_1, ···, α_n) is a lattice.


Proof: Let us write R, as Z[T] in the usual way. Each œ; can be written as a;+b,T for 
integers a; and b;. We then forma 2 x n matrix co Ea pr) whose columns (5) 
correspond to the «;’s. The operations (1)-(3) correspond to adding an integer 
times one column to another column, changing the sign of a column, and permuting 
columns. 

These three column operations can be used to simplify the matrix until only the
first two columns are nonzero. To do this we first focus on the second row. This must
have a nonzero entry since the $\alpha_i$'s are not all contained in the x-axis. The nonzero
entries in the second row can be made all positive by changing the sign of some
columns. Choose a column with smallest positive entry $b_i$. By subtracting suitable
multiples of this column from the other columns with positive $b_j$'s we can make all
other $b_j$'s either zero or positive integers less than $b_i$. This process can be repeated
using columns with successively smaller second entries until only one nonzero $b_i$
remains. Switching this column with the first column, we can then assume that $b_i = 0$
for all $i > 1$.

Now we do the same procedure for columns 2 through $n$ using the entries $a_i$
rather than $b_i$. Since these columns have $b_i = 0$, nothing changes in the second
row. After this step is finished, only the first two columns will be nonzero. Note that
neither of these columns can have both entries zero, otherwise $L(\alpha_1, \dots, \alpha_n)$ would
be entirely contained in a line through the origin. □
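
The proof of Lemma 8.24 is effectively an algorithm. A minimal Python sketch of it follows, assuming each generator $a_i + b_i\tau$ is stored as an integer pair `(a_i, b_i)`. The function names `clear_row` and `reduce_generators` are illustrative choices, not notation from the text.

```python
def clear_row(cols, row, start):
    """Apply operations (1)-(3) to cols[start:] until at most one column has a
    nonzero entry in `row`, then move that column to position `start`."""
    cols = [list(c) for c in cols]
    while True:
        # operation (2): make the nonzero entries in this row positive
        for c in cols[start:]:
            if c[row] < 0:
                c[0], c[1] = -c[0], -c[1]
        nonzero = [i for i in range(start, len(cols)) if cols[i][row] != 0]
        if len(nonzero) <= 1:
            break
        pivot = min(nonzero, key=lambda i: cols[i][row])
        for i in nonzero:
            if i != pivot:
                # operation (1): subtract a multiple of the pivot column
                k = cols[i][row] // cols[pivot][row]
                cols[i] = [x - k * y for x, y in zip(cols[i], cols[pivot])]
    if nonzero:
        # operation (3): swap the surviving column into place
        i = nonzero[0]
        cols[start], cols[i] = cols[i], cols[start]
    return cols


def reduce_generators(gens):
    """Return the two surviving generators (beta_1, beta_2) as integer pairs."""
    cols = clear_row(gens, 1, 0)   # second row first, over all columns
    cols = clear_row(cols, 0, 1)   # then the first row, over columns 2..n
    return tuple(cols[0]), tuple(cols[1])


# generators 3, 1 + 2*tau, 4*tau
print(reduce_generators([(3, 0), (1, 2), (0, 4)]))
```

For the three generators in the last line the sketch returns the pairs `(1, 2)` and `(1, 0)`, so the lattice is $L(1, 1 + 2\tau)$.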


Let us restrict attention now to lattices that are ideals. One way to generate
such a lattice is to start with elements $\alpha_1, \dots, \alpha_n$ in $R_\Delta$ which we can assume are
nonzero and then consider the set of all elements $\sum_i \gamma_i\alpha_i$ for arbitrary coefficients $\gamma_i$
in $R_\Delta$ rather than just taking integer coefficients as we would be doing for the lattice
$L(\alpha_1, \dots, \alpha_n)$. The usual notation for this set of all sums $\sum_i \gamma_i\alpha_i$ is $(\alpha_1, \dots, \alpha_n)$,
generalizing the earlier notation $(\alpha)$ for a principal ideal. The ideal $(\alpha_1, \dots, \alpha_n)$
is equal to the lattice $L(\alpha_1, \alpha_1\tau, \alpha_2, \alpha_2\tau, \dots, \alpha_n, \alpha_n\tau)$ where $R_\Delta = \mathbb{Z}[\tau]$ since each
coefficient $\gamma_i$ in a sum $\sum_i \gamma_i\alpha_i$ can be written as $x_i + y_i\tau$ for integers $x_i$ and $y_i$. To be
sure that $(\alpha_1, \dots, \alpha_n)$ really is a lattice, we should check that $\alpha_1, \alpha_1\tau, \dots, \alpha_n, \alpha_n\tau$
do not all lie on the same line through the origin. But this is true already for $\alpha_1$ and
$\alpha_1\tau$ since $(\alpha_1)$ is an ideal as we saw in the previous section.

Observe that if a lattice $L(\alpha_1, \dots, \alpha_n)$ is an ideal, then $L(\alpha_1, \dots, \alpha_n)$ is equal
to $(\alpha_1, \dots, \alpha_n)$ since every product $\gamma\alpha_i$ with $\gamma$ in $R_\Delta$ can be rewritten as an integer
linear combination of $\alpha_1, \dots, \alpha_n$ if $L(\alpha_1, \dots, \alpha_n)$ is an ideal. A consequence of this,
using Lemma 8.24, is that every ideal $(\alpha_1, \dots, \alpha_n)$ with $n > 2$ can be rewritten as an
ideal $(\beta_1, \beta_2)$.


Now we return to products of ideals. For ideals $L = (\alpha_1, \alpha_2)$ and $M = (\beta_1, \beta_2)$
the product $LM$ is the ideal $(\alpha_1\beta_1, \alpha_1\beta_2, \alpha_2\beta_1, \alpha_2\beta_2)$ since each of the four products
$\alpha_i\beta_j$ is in $LM$ and every element of $LM$ is a sum of terms $\alpha\beta$ for $\alpha = \gamma_1\alpha_1 + \gamma_2\alpha_2$ and
$\beta = \delta_1\beta_1 + \delta_2\beta_2$, so $\alpha\beta$ is a linear combination of the products $\alpha_i\beta_j$ with coefficients
in $R_\Delta$. Similarly, the product of ideals $(\alpha_1, \dots, \alpha_n)$ and $(\beta_1, \dots, \beta_k)$ is the ideal
generated by all the products $\alpha_i\beta_j$.

As examples let us compute some products of ideals in $\mathbb{Z}[\sqrt{-5}]$, which is $R_\Delta$ for
$\Delta = -20$. Consider first the ideal corresponding to the form $2x^2 + 2xy + 3y^2$, the
lattice $L(2, 1 + \sqrt{-5})$. Since this is an ideal it is the same as the ideal $(2, 1 + \sqrt{-5})$.
Denoting this ideal as $P$, let us compute its square $P^2 = PP$. We have:


$$P^2 = (2,\ 1 + \sqrt{-5})(2,\ 1 + \sqrt{-5}) = (4,\ 2 + 2\sqrt{-5},\ 6)$$

Here the product $(1 + \sqrt{-5})^2 = -4 + 2\sqrt{-5}$ has been replaced by $6$, which gives the same
ideal since $6 = (2 + 2\sqrt{-5}) - (-4 + 2\sqrt{-5})$.

In this ideal each generator is a multiple of 2 so we can pull out a factor of 2 to
get $P^2 = 2(2, 1 + \sqrt{-5}, 3)$. The ideal $(2, 1 + \sqrt{-5}, 3)$ contains 3 and 2 so it contains
their difference 1. Once an ideal contains 1 it must be the whole ring, so
$(2, 1 + \sqrt{-5}, 3) = (1) = \mathbb{Z}[\sqrt{-5}]$ and hence $P^2 = 2(1) = (2)$. The figure at the right shows
these ideals as lattices, with $(2, 1 + \sqrt{-5})$ indicated by the heavy dots and its square $(2)$
by the dots in squares. Notice that $P^2$ is a sublattice of $P$. In fact
it is always true that a product $LM$ of two ideals $L$ and $M$ is a sublattice of both $L$
and $M$ since each term of a typical element $\sum_i \alpha_i\beta_i$
of $LM$ lies in both $L$ and $M$ by the defining property of ideals.

This example also illustrates the fact that a product $LM$ of two ideals need not
consist just of all products $\alpha\beta$ of an element of $L$ with an element of $M$ since the
number 2 belongs to $P^2$ but if we had $2 = \alpha\beta$ with $\alpha$ and $\beta$ in $P$ then, computing
norms, we would have $4 = N(\alpha)N(\beta)$. There are no elements of $\mathbb{Z}[\sqrt{-5}]$ of norm $\pm 2$
since $N(x + y\sqrt{-5}) = x^2 + 5y^2 = \pm 2$ has no integer solutions. Thus either $\alpha$ or $\beta$
would have norm $\pm 1$ and hence be one of the two units $\pm 1$ in $\mathbb{Z}[\sqrt{-5}]$. However,
neither 1 nor $-1$ is in $P$, otherwise we would have $P = \mathbb{Z}[\sqrt{-5}]$.

Continuing with the ring $\mathbb{Z}[\sqrt{-5}]$, we consider next the ideal $Q = (3, 1 + \sqrt{-5})$
corresponding to the form $3x^2 + 2xy + 2y^2$. For the product $PQ$ we have:

$$PQ = (2,\ 1 + \sqrt{-5})(3,\ 1 + \sqrt{-5}) = (6,\ 2 + 2\sqrt{-5},\ 3 + 3\sqrt{-5},\ -4 + 2\sqrt{-5})$$

The last generator $-4 + 2\sqrt{-5}$ can be discarded since it is the second generator minus
the first generator. The difference between the second and third generators is $1 + \sqrt{-5}$
so this is in $PQ$, and these two generators are multiples of $1 + \sqrt{-5}$ so we now have
$PQ = (6, 1 + \sqrt{-5})$. But 6 is in the ideal $(1 + \sqrt{-5})$ since it is $1 - \sqrt{-5}$ times $1 + \sqrt{-5}$,
the norm of $1 + \sqrt{-5}$, so we have finally $PQ = (1 + \sqrt{-5})$.

Next we calculate $Q\overline{Q}$ where the conjugate $\overline{L}$ of an ideal $L = (\alpha, \beta)$ is the ideal
consisting of all the conjugates of elements of $L$, so $\overline{L} = (\overline{\alpha}, \overline{\beta})$. We have:

$$Q\overline{Q} = (3,\ 1 + \sqrt{-5})(3,\ 1 - \sqrt{-5}) = (9,\ 3 + 3\sqrt{-5},\ 3 - 3\sqrt{-5},\ 6)$$
$$= 3(3,\ 1 + \sqrt{-5},\ 1 - \sqrt{-5},\ 2) = (3)$$

For the product $P\overline{P}$ there is no need to do a separate calculation since $P = \overline{P}$ as one
can see in the previous figure, so $P\overline{P} = P^2 = (2)$.
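
These three products can be checked mechanically. The following Python sketch assumes elements $x + y\sqrt{d}$ of $\mathbb{Z}[\sqrt{d}]$ are stored as pairs `(x, y)` and reduces a list of ideal generators to a basis $a,\ b + c\sqrt{d}$ by the column operations of Lemma 8.24. The helper names are illustrative, not from the text.

```python
def clear_row(cols, row, start):
    """Column-reduce cols[start:] so that only cols[start] is nonzero in `row`."""
    cols = [list(c) for c in cols]
    while True:
        nz = [i for i in range(start, len(cols)) if cols[i][row] != 0]
        if len(nz) <= 1:
            break
        p = min(nz, key=lambda i: abs(cols[i][row]))
        for i in nz:
            if i != p:
                k = cols[i][row] // cols[p][row]
                cols[i] = [x - k * y for x, y in zip(cols[i], cols[p])]
    if nz:
        i = nz[0]
        if cols[i][row] < 0:
            cols[i] = [-x for x in cols[i]]
        cols[start], cols[i] = cols[i], cols[start]
    return cols


def ideal_basis(gens, d):
    """Return (a, b, c), meaning the ideal (gens) equals L(a, b + c*sqrt(d))."""
    # as a lattice, the ideal is spanned by each generator g and g*sqrt(d)
    cols = [[x, y] for x, y in gens] + [[d * y, x] for x, y in gens]
    cols = clear_row(clear_row(cols, 1, 0), 0, 1)
    (b, c), (a, _) = cols[0], cols[1]
    return a, b % a, c


def ideal_product(I, J, d):
    """Generators alpha_i * beta_j of the product of two ideals."""
    return [(x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)
            for x1, y1 in I for x2, y2 in J]


d = -5
P = [(2, 0), (1, 1)]        # (2, 1 + sqrt(-5))
Q = [(3, 0), (1, 1)]        # (3, 1 + sqrt(-5))
Qbar = [(3, 0), (1, -1)]    # (3, 1 - sqrt(-5))
print(ideal_basis(ideal_product(P, P, d), d))      # compare with the ideal (2)
print(ideal_basis(ideal_product(P, Q, d), d))      # compare with (1 + sqrt(-5))
print(ideal_basis(ideal_product(Q, Qbar, d), d))   # compare with (3)
```

Since the basis with $a > 0$, $c > 0$, $0 \le b < a$ is determined by the lattice, two ideals are equal exactly when `ideal_basis` returns the same triple for both.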


Using these calculations we can see how the two different factorizations of $(6)$
in $\mathbb{Z}[\sqrt{-5}]$ as $(2)(3)$ and as $(1 + \sqrt{-5})(1 - \sqrt{-5})$ arise:

$$(6) = (2)(3) = P\overline{P} \cdot Q\overline{Q} = P\overline{P}Q\overline{Q}$$
$$(6) = (1 + \sqrt{-5})(1 - \sqrt{-5}) = PQ \cdot \overline{PQ} = PQ\,\overline{P}\,\overline{Q}$$

For the last equality we are using the general identity $\overline{LM} = \overline{L}\,\overline{M}$ which follows easily
from the definitions.

We defined the norm of an ideal $L$ in $R_\Delta$ geometrically as the number of parallel
translates of $L$, including $L$ itself, that are needed to fill up all of $R_\Delta$, and we found
other ways to view these norms in terms of areas and determinants. For the ideals we
will be most interested in, the stable ideals in Proposition 8.22, there is yet another
interpretation of the norm $N(L)$ that is more like the definition of the norm of an
element $\alpha$ as $N(\alpha) = \alpha\overline{\alpha}$.


Proposition 8.25. If the ideal $L$ in $R_\Delta$ is stable then $L\overline{L} = (N(L))$, the principal ideal
generated by the norm $N(L)$.


In the preceding example the calculations of $P\overline{P}$ and $Q\overline{Q}$ are consequences of
this general result since the norm of an ideal $(a, 1 + \sqrt{-5})$ is $a$.


Proof: By Proposition 8.16 the ideal $L$ is equal to $nL\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$ for some integer $n \ge 1$
and some form $ax^2 + bxy + cy^2$ of discriminant $\Delta$ with $a > 0$. It will suffice to prove
the proposition in the case $n = 1$ since replacing an ideal $L$ by $nL$ does not affect the
stabilizer and it multiplies $N(L)$ by $n^2$, so both sides of the equation $L\overline{L} = (N(L))$ are
multiplied by $n^2$. Thus we may take $L = L\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$ for the rest of the proof. Since
we assume $L$ is stable, the form $ax^2 + bxy + cy^2$ is primitive by Proposition 8.22.

Let $\tau = \frac{b + \sqrt{\Delta}}{2}$ so $\tau$ is a root of the equation $x^2 - bx + ac = 0$ and $\tau\overline{\tau} = ac$. We
have $L = (a, \tau)$ and $\overline{L} = (a, \overline{\tau})$. The product $L\overline{L}$ is then:

$$L\overline{L} = (a^2,\ a\tau,\ a\overline{\tau},\ \tau\overline{\tau}) = (a^2,\ a\tau,\ a\overline{\tau},\ ac) = a(a,\ \tau,\ \overline{\tau},\ c)$$

The ideal $(a, \tau, \overline{\tau}, c)$ contains the ideal $(a, \tau + \overline{\tau}, c) = (a, b, c)$. The latter ideal is all
of $R_\Delta$ since it contains all integral linear combinations $ma + nb + qc$ and there is
one such combination that equals 1 since the greatest common divisor of $a$, $b$, and $c$
is 1 because the form $ax^2 + bxy + cy^2$ is primitive. (We know from Chapter 2 that
the greatest common divisor $d$ of $a$ and $b$ can be written as $d = ma + nb$, and then
the greatest common divisor of $d$ and $c$, which is the greatest common divisor of $a$,
$b$, and $c$, can be written as an integral linear combination of $d$ and $c$ and hence also
of $a$, $b$, and $c$.)

Thus the ideal $(a, \tau, \overline{\tau}, c)$ contains $R_\Delta$ and so must equal it. Hence we have
$L\overline{L} = aR_\Delta = (a)$ and this equals $(N(L))$ since $N(L) = a$ for $L = L\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$. □
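
The parenthetical step in the proof, writing $\gcd(a, b, c)$ as an integral combination $ma + nb + qc$, can be made concrete with the extended Euclidean algorithm applied twice. The sketch below is one possible way to do this; the function names are hypothetical.

```python
def ext_gcd(x, y):
    """Return (g, m, n) with m*x + n*y == g, where g = gcd(x, y) >= 0."""
    if y == 0:
        return abs(x), (1 if x >= 0 else -1), 0
    g, m, n = ext_gcd(y, x % y)
    return g, n, m - (x // y) * n


def gcd_combination(a, b, c):
    """Return (g, m, n, q) with m*a + n*b + q*c == g = gcd(a, b, c)."""
    d, m, n = ext_gcd(a, b)    # d = m*a + n*b
    g, s, q = ext_gcd(d, c)    # g = s*d + q*c
    return g, s * m, s * n, q


# the primitive form 2x^2 + 2xy + 3y^2 of discriminant -20
g, m, n, q = gcd_combination(2, 2, 3)
print(g, m * 2 + n * 2 + q * 3)
```

For a primitive form the returned `g` is 1, so the combination exhibits 1 as an element of the ideal $(a, b, c)$.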


Proposition 8.26. An ideal $L$ in $R_\Delta$ is stable if and only if there exists an ideal $M$
in $R_\Delta$ such that $LM$ is a principal ideal.


Proof: The forward implication follows from Proposition 8.25 by choosing $M = \overline{L}$.
For the opposite implication, suppose that $LM = (\alpha)$, and let $\beta$ be an element of
$\mathbb{Q}(\sqrt{\Delta})$ such that $\beta L$ is contained in $L$. Then $\beta(\alpha) = \beta LM$ is contained in $LM = (\alpha)$.
In particular this says that $\beta\alpha$ is in $(\alpha)$ so $\beta\alpha = \gamma\alpha$ for some element $\gamma$ of $R_\Delta$. Since
$\alpha$ is nonzero this implies $\beta = \gamma$ and so $\beta$ is an element of $R_\Delta$. This shows that the
stabilizer of $L$ is $R_\Delta$, so $L$ is stable. □

Proposition 8.27. If $L$ and $M$ are stable ideals in $R_\Delta$ then $N(LM) = N(L)N(M)$.


Proof: If $L$ and $M$ are stable then so is $LM$ by Proposition 8.26 since the product of
two principal ideals is principal. Since $\overline{LM} = \overline{L}\,\overline{M}$ we have $LM\,\overline{LM} = L\overline{L}M\overline{M}$ which
means $(N(LM)) = (N(L))(N(M))$. We also have $(N(L))(N(M)) = (N(L)N(M))$ since
for principal ideals we always have $(\alpha)(\beta) = (\alpha\beta)$. Thus $(N(LM)) = (N(L)N(M))$,
and this implies $N(LM) = N(L)N(M)$ since if $(a) = (b)$ for positive integers $a$ and
$b$ then $a = b$, as is evident from the lattices $(a) = L(a, a\tau)$ and $(b) = L(b, b\tau)$ for
$R_\Delta = \mathbb{Z}[\tau]$. □


The formula $L\overline{L} = (N(L))$ and the multiplicative property $N(LM) = N(L)N(M)$
can fail to hold for ideals with stabilizer larger than $R_\Delta$. A simple example is provided
by taking $L$ to be the ideal $(2, 1 + \sqrt{-3})$ in $\mathbb{Z}[\sqrt{-3}]$ which we considered in the
previous section, before Proposition 8.22, as an example of an ideal corresponding
to the nonprimitive form $2x^2 + 2xy + 2y^2$ of discriminant $-12$. Here $\overline{L} = L$ and
the ideal $L^2 = L\overline{L}$ is $(2, 1 + \sqrt{-3})(2, 1 - \sqrt{-3}) = (4, 2 + 2\sqrt{-3}, 2 - 2\sqrt{-3}, 4)$. Of these
four generators we can obviously drop the repeated 4, and we can also omit the third
generator which is expressible as the first generator minus the second. We are left
with the ideal $(4, 2 + 2\sqrt{-3}) = 2(2, 1 + \sqrt{-3})$. Thus we have $L^2 = L\overline{L} = 2L$.


[Figure: the lattices $L = L(2, 1 + \sqrt{-3})$, $L^2 = 2L = L(4, 2 + 2\sqrt{-3})$, and $(2) = L(2, 2\sqrt{-3})$.]

From the figure we see that $N(L) = 2$ and hence $N(2L) = 2^2 N(L) = 8$ so $N(L^2) \ne
N(L)^2 = 4$. This shows that $N(LM)$ need not equal $N(L)N(M)$ in general. Also we
see from the figure that $L\overline{L} \ne (N(L))$ since $L\overline{L} = L^2 = 2L \ne (2) = (N(L))$. In fact $L\overline{L}$
is not even a principal ideal since $2L$ is a lattice of equilateral triangles while principal
ideals have the same shape as the rectangular lattice $\mathbb{Z}[\sqrt{-3}]$.
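
The two norms read off from the figure can also be computed from determinants. The Python sketch below relies on a standard fact that is an assumption here rather than something stated in the text: the index in $\mathbb{Z}^2$ of a lattice spanned by integer columns equals the greatest common divisor of all $2 \times 2$ minors of those columns. Elements $x + y\sqrt{d}$ are stored as pairs `(x, y)`, and the function names are illustrative.

```python
from itertools import combinations
from math import gcd


def ideal_product(I, J, d):
    """Generators alpha_i * beta_j of the product of two ideals in Z[sqrt(d)]."""
    return [(x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)
            for x1, y1 in I for x2, y2 in J]


def ideal_norm(gens, d):
    """Norm of the ideal (gens), computed as the gcd of all 2x2 minors."""
    # lattice generators: each g together with g*sqrt(d)
    cols = list(gens) + [(d * y, x) for x, y in gens]
    g = 0
    for (x1, y1), (x2, y2) in combinations(cols, 2):
        g = gcd(g, x1 * y2 - x2 * y1)
    return g


d = -3
L = [(2, 0), (1, 1)]                      # (2, 1 + sqrt(-3)), equal to its conjugate
print(ideal_norm(L, d))                   # N(L)
print(ideal_norm(ideal_product(L, L, d), d))   # N(L^2)
```

The first value is 2 and the second is 8, in agreement with $N(L^2) = 8 \ne 4 = N(L)^2$.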


Now at last we come to the construction of the ideal class group, which we will
denote $ICG(\Delta)$ until we show that it coincides with the class group $CG(\Delta)$ defined
in Chapter 7 in terms of forms. Let $[L]$ denote the strict equivalence class of a stable
ideal $L$ in $R_\Delta$ and let $ICG(\Delta)$ be the set of such classes $[L]$. The multiplication
operation in $ICG(\Delta)$ is defined by taking products of ideals, so we set $[L][M] = [LM]$,
recalling the fact that the product of two stable ideals is stable by Proposition 8.26.
To check that this product in $ICG(\Delta)$ is well defined we need to see that choosing
different ideals $L'$ and $M'$ in the classes $[L]$ and $[M]$ does not affect $[LM]$. This
is true because $[L] = [L']$ means $\alpha L = \alpha' L'$ for some $\alpha$ and $\alpha'$, and $[M] = [M']$
means $\beta M = \beta' M'$ for some $\beta$ and $\beta'$, hence $\alpha\beta LM = \alpha'\beta' L'M'$, so $[LM] = [L'M']$.
Here we are dealing with strict equivalence classes of ideals so we are assuming all of
$\alpha$, $\beta$, $\alpha'$, $\beta'$ have positive norms, hence so do $\alpha\beta$ and $\alpha'\beta'$. (As always this condition
is automatic when $\Delta$ is negative.)


Proposition 8.28. $ICG(\Delta)$ is a commutative group with respect to the multiplication
$[L][M] = [LM]$.


Proof: The commutativity property $[L][M] = [M][L]$ is easy since this amounts to
saying $[LM] = [ML]$, which holds since multiplication of ideals is commutative, $LM =
ML$, because multiplication in $R_\Delta$ is commutative.

To have a group there are three things to check. First, the multiplication should
be associative, so $([L][M])[N] = [L]([M][N])$. By the definition of the product
in $ICG(\Delta)$ this is equivalent to saying $[LM][N] = [L][MN]$ which in turn means
$[(LM)N] = [L(MN)]$, so it suffices to check that multiplication of ideals is associative,
$(LM)N = L(MN)$. The claim is that each of these two products consists of all
the finite sums $\sum_i \alpha_i\beta_i\gamma_i$ with $\alpha_i$, $\beta_i$, and $\gamma_i$ elements of $L$, $M$, and $N$ respectively.
Every such sum is in both $(LM)N$ and $L(MN)$ since each term $\alpha_i\beta_i\gamma_i$ is in both of
the ideals $(LM)N$ and $L(MN)$. Conversely, each element of $(LM)N$ is a sum of terms
$(\sum_i \alpha_i\beta_i)\gamma$ so it can be written as a sum $\sum_i \alpha_i\beta_i\gamma_i$, and similarly each element of
$L(MN)$ can be written as a sum $\sum_i \alpha_i\beta_i\gamma_i$. Thus we have $(LM)N = L(MN)$.

Next, a group must have an identity element, and the class $[(1)]$ of the ideal $(1) =
R_\Delta$ obviously serves this purpose since $(1)L = L$ for all ideals $L$, hence $[(1)][L] =
[L]$. There is no need to check that $[L][(1)] = [L]$ as one would have to do for a
noncommutative group since we have already observed that multiplication in $ICG(\Delta)$
is commutative.

The last thing to check is that each element of $ICG(\Delta)$ has a multiplicative inverse,
and this is where we use the condition that we are considering only stable ideals in the
definition of $ICG(\Delta)$. As we showed in Proposition 8.25, each stable ideal $L$ satisfies
$L\overline{L} = (n)$ where the integer $n$ is the norm of $L$. Then we have $[L][\overline{L}] = [(n)] = [(1)]$
where this last equality holds since the ideals $(n)$ and $(1)$ are strictly equivalent, the
norm of $n$ being $n^2$, a positive integer. Thus the multiplicative inverse of $[L]$ is $[\overline{L}]$.
Again commutativity of the multiplication means that we do not have to check that
$[\overline{L}]$ is an inverse for $[L]$ for multiplication both on the left and on the right. □


There is a variant of the ideal class group in which the relation of strict equivalence
of ideals is modified by deleting the word “strict”, so an ideal $L$ is considered equivalent
to $\alpha L$ for all nonzero elements $\alpha$ of $R_\Delta$ without the condition that $N(\alpha) > 0$.
The preceding proof that $ICG(\Delta)$ is a group applies equally well in this setting by
just omitting any mention of norms being positive. Sometimes the resulting group is
called the class group while $ICG(\Delta)$ is called the strict class group or narrow class
group. However, for studying quadratic forms the more appropriate notion is strict
equivalence, which is why we are using this for the class group $ICG(\Delta)$.

Next we check that the one-to-one correspondence $\Phi: CG(\Delta) \to ICG(\Delta)$ induced
by sending a form $Q = ax^2 + bxy + cy^2$ with $a > 0$ to the ideal $L_Q = \left(a, \frac{b + \sqrt{\Delta}}{2}\right)$
respects the group structures defined on $CG(\Delta)$ and $ICG(\Delta)$. Given two classes
$[Q_1]$ and $[Q_2]$ in $CG(\Delta)$, we can realize them by concordant forms $[a_1, b, a_2c]$ and
$[a_2, b, a_1c]$ with $a_1$ and $a_2$ coprime and positive. The product $[Q_1][Q_2]$ in $CG(\Delta)$ is
then the class of $[a_1a_2, b, c]$. The ideals corresponding to these three forms are
$L_1 = \left(a_1, \frac{b + \sqrt{\Delta}}{2}\right)$, $L_2 = \left(a_2, \frac{b + \sqrt{\Delta}}{2}\right)$, and $L_3 = \left(a_1a_2, \frac{b + \sqrt{\Delta}}{2}\right)$. To show that multiplication
in $CG(\Delta)$ corresponds under $\Phi$ to multiplication in $ICG(\Delta)$ it will suffice to show
that $L_1L_2 = L_3$. The product $L_1L_2$ is the ideal
$\left(a_1a_2,\ a_1\frac{b + \sqrt{\Delta}}{2},\ a_2\frac{b + \sqrt{\Delta}}{2},\ \left(\frac{b + \sqrt{\Delta}}{2}\right)^2\right)$.
This is certainly contained in $L_3$ since the first generator $a_1a_2$ is in $L_3$ and the other
three generators are multiples of $\frac{b + \sqrt{\Delta}}{2}$ by elements of $R_\Delta$ hence are in $L_3$. On the
other hand $L_3$ is contained in $L_1L_2$ since $a_1a_2$ is in $L_1L_2$ and so is $\frac{b + \sqrt{\Delta}}{2}$ which can
be written as a linear combination $m a_1\frac{b + \sqrt{\Delta}}{2} + n a_2\frac{b + \sqrt{\Delta}}{2}$ for some integers $m$ and
$n$, using the fact that $a_1$ and $a_2$ are coprime so we have $ma_1 + na_2 = 1$ for some
integers $m$ and $n$.

The identity element of $CG(\Delta)$ is the class of the principal form $[1, b, c]$ and this
is sent by $\Phi$ to the class of the ideal $\left(1, \frac{b + \sqrt{\Delta}}{2}\right) = (1)$ which is the identity element of
$ICG(\Delta)$. The inverse of an element of $CG(\Delta)$ determined by a form $[a, b, c]$ is the
class of the mirror image form $[a, -b, c]$, so under $\Phi$ these forms correspond to the
ideals $\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$ and $\left(a, \frac{-b + \sqrt{\Delta}}{2}\right)$. The latter ideal is the same as $\left(a, \frac{b - \sqrt{\Delta}}{2}\right)$ which is
the conjugate of $\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$ so it gives the inverse of $\left(a, \frac{b + \sqrt{\Delta}}{2}\right)$ in $ICG(\Delta)$.


Thus the group structures on $CG(\Delta)$ and $ICG(\Delta)$ are really the same, and we
can use the notation $CG(\Delta)$ for both without any conflict.


To illustrate this let us consider $CG(\Delta)$ for $\Delta = -104$, so $R_\Delta = \mathbb{Z}[\sqrt{-26}]$. We
looked at this example in Section 7.2 and found that $CG(\Delta)$ is a cyclic group of order
6 generated by the form $Q_4 = [5, 4, 6]$. From the topographs we could see that
$Q_4^2$ was either $Q_3 = [3, 2, 9]$ or $Q_3^{-1} = [3, -2, 9]$, but to determine which, we had
to find a pair of concordant forms equivalent to $Q_4$ and multiply them together.
Now we can use ideals to do the same calculation. The ideal corresponding to $Q_4 =
[5, 4, 6]$ is $(5, 2 + \sqrt{-26})$ so for $Q_4^2$ the ideal is $(5, 2 + \sqrt{-26})(5, 2 + \sqrt{-26})$ which
equals $(25, 10 + 5\sqrt{-26}, -22 + 4\sqrt{-26})$. The next step is to find a reduced basis for this
ideal. As a lattice this ideal is generated by these three elements and their products
with $\sqrt{-26}$. Thus we have the matrix
$\begin{pmatrix} 25 & 10 & -22 & 0 & -130 & -104 \\ 0 & 5 & 4 & 25 & 10 & -22 \end{pmatrix}$ which reduces to
$\begin{pmatrix} 25 & 7 \\ 0 & 1 \end{pmatrix}$ so the ideal is $(25, 7 + \sqrt{-26})$. The corresponding form is $[25, 14, c]$ and
we can determine $c$ from the discriminant equation $b^2 - 4ac = -104$ which gives
$c = 3$. The form is thus $[25, 14, 3]$. A small portion of the topograph of this form is
shown at the right. There is a source vertex surrounded by the three
values 3, 9, 10 in counterclockwise order. The form $[3, -2, 9]$ has
exactly this same configuration at its source vertex, so we conclude
that $Q_4^2 = Q_3^{-1}$, the same answer we got in Section 7.2.
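
The reduction in this example can be reproduced with a short Python sketch. It assumes elements $x + y\sqrt{-26}$ are stored as pairs `(x, y)` and uses the column operations of Lemma 8.24. The helper names, and the conversion from an ideal $(a, B + \sqrt{d})$ to the form $[a, 2B, (B^2 - d)/a]$, are written out here as illustrative assumptions consistent with the correspondence used above.

```python
def clear_row(cols, row, start):
    """Column-reduce cols[start:] so that only cols[start] is nonzero in `row`."""
    cols = [list(c) for c in cols]
    while True:
        nz = [i for i in range(start, len(cols)) if cols[i][row] != 0]
        if len(nz) <= 1:
            break
        p = min(nz, key=lambda i: abs(cols[i][row]))
        for i in nz:
            if i != p:
                k = cols[i][row] // cols[p][row]
                cols[i] = [x - k * y for x, y in zip(cols[i], cols[p])]
    if nz:
        i = nz[0]
        if cols[i][row] < 0:
            cols[i] = [-x for x in cols[i]]
        cols[start], cols[i] = cols[i], cols[start]
    return cols


def ideal_basis(gens, d):
    """Return (a, b, c), meaning the ideal (gens) equals L(a, b + c*sqrt(d))."""
    cols = [[x, y] for x, y in gens] + [[d * y, x] for x, y in gens]
    cols = clear_row(clear_row(cols, 1, 0), 0, 1)
    (b, c), (a, _) = cols[0], cols[1]
    return a, b % a, c


def ideal_product(I, J, d):
    return [(x1 * x2 + d * y1 * y2, x1 * y2 + x2 * y1)
            for x1, y1 in I for x2, y2 in J]


def form_of_ideal(a, B, d):
    """Form [a, 2B, (B^2 - d)/a] of discriminant 4d for the ideal (a, B + sqrt(d))."""
    return a, 2 * B, (B * B - d) // a


d = -26
I = [(5, 0), (2, 1)]                        # ideal of the form [5, 4, 6]
a, B, c = ideal_basis(ideal_product(I, I, d), d)
print((a, B, c), form_of_ideal(a, B, d))
```

The sketch reports the basis $25,\ 7 + \sqrt{-26}$ and the form $[25, 14, 3]$, matching the hand computation.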
Exercises

1. Corresponding to a lattice $L(\alpha_1, \dots, \alpha_n)$ in $\mathbb{Z}[\tau]$ there is a matrix
$\begin{pmatrix} a_1 & \cdots & a_n \\ b_1 & \cdots & b_n \end{pmatrix}$
with $\alpha_i = a_i + b_i\tau$ as in Lemma 8.24. Show that the three operations of adding a
multiple of one column to another, changing the sign of a column, and permuting
columns do not change the greatest common divisor of the numbers in each row of
the matrix. Deduce from this that if $a, b + c\tau$ is the reduced basis for the lattice then
$c$ is the greatest common divisor of the entries in the second row of the matrix.


2. In $\mathbb{Z}[\sqrt{-6}]$ compute the powers of the ideal $(2, \sqrt{-6})$ and determine which powers
are principal ideals.


3. In $\mathbb{Z}[\sqrt{-14}]$ do the following:

(a) Compute the square of the ideal $(2, \sqrt{-14})$.

(b) For the ideal $L = (3, 1 + \sqrt{-14})$ find a reduced basis for $L^2$ and use this to draw a
picture of the lattice $L^2$.

(c) Find nonzero elements $\alpha$ and $\beta$ in $\mathbb{Z}[\sqrt{-14}]$ such that $\alpha L^2 = \beta(2, \sqrt{-14})$.


4. In $\mathbb{Z}[\sqrt{-5}]$ compute $Q^2$ for $Q = (3, 1 + \sqrt{-5})$ as a principal ideal $(\alpha)$ after first
determining what $N(\alpha)$ must be.


5. Use the formula $N(LM) = N(L)N(M)$ with $L = (\alpha)$ and $M = (\overline{\alpha})$ to give another
proof that $N((\alpha)) = |N(\alpha)|$.


8.5 Unique Factorization of Ideals 


In this section we will be restricting our attention exclusively to discriminants $\Delta$
that are fundamental discriminants, so all forms will be primitive and hence all ideals
in $R_\Delta$ will be stable. This means that we will be able to make free use of the formulas
$N(LM) = N(L)N(M)$ and $L\overline{L} = (N(L))$.


Our main goal in this section is to show that all ideals in $R_\Delta$, with the trivial exception
of $R_\Delta$ itself, have unique factorizations as products of prime ideals, where
an ideal $P$ different from $R_\Delta$ is called a prime ideal if whenever it is expressed as a
product $LM$ of two ideals in $R_\Delta$, either $L$ or $M$ must equal $R_\Delta$, so the factorization
becomes the trivial factorization $P = R_\Delta P$ that every ideal has. Note that $R_\Delta$, considered
as an ideal in $R_\Delta$, satisfies this condition but we do not allow $R_\Delta$ as a prime
ideal, just as the number 1 is not considered a prime number.

For an element $\alpha$ of $R_\Delta$ we know that $\alpha$ is prime if its norm $N(\alpha)$ is prime in $\mathbb{Z}$,
either positive or negative. The analogue for ideals also holds:


Proposition 8.29. If the norm $N(P)$ of an ideal $P$ is prime then $P$ is a prime ideal.

Proof: Suppose $P = LM$. Then $N(P) = N(L)N(M)$. If $N(P)$ is prime then since $N(L)$
and $N(M)$ are positive integers, one of them must be 1. The only ideal of norm 1 is
$R_\Delta$ so this means $L$ or $M$ must be $R_\Delta$. Thus $P$ is a prime ideal. □


Proposition 8.30. For each prime $p$ the principal ideal $(p)$ in $R_\Delta$ is either a prime
ideal or it factors as $(p) = P\overline{P}$ for prime ideals $P$ and $\overline{P}$ of norm $p$.


As we will see later in Corollary 8.34, all prime ideals in $R_\Delta$ are accounted for by
this proposition, so every prime ideal is either a principal ideal $(p)$ with $p$ prime or
a factor $P$ or $\overline{P}$ when $(p) = P\overline{P}$.


Proof: If $(p)$ is not a prime ideal then it factors as $(p) = PQ$ for ideals $P$ and $Q$ not
equal to $R_\Delta$. Since the norm of $(p)$ is $p^2$ we must have $N(P) = p$ and $N(Q) = p$.
From the general formula $L\overline{L} = (N(L))$ we have $P\overline{P} = (N(P)) = (p)$. Since $N(P) = p$
we must also have $N(\overline{P}) = p$ so $P$ and $\overline{P}$ are both prime ideals. (From the unique
prime factorization property of ideals it will follow that $Q = \overline{P}$, but we do not need
to know this here.) □


In the case that $(p) = P\overline{P}$ the prime $p$ is said to split in $R_\Delta$. The primes that
split in $R_\Delta$ are the primes that are norms of ideals in $R_\Delta$, and as we saw in Section 8.3
these are exactly the primes that are represented by forms of discriminant $\Delta$. For a
split prime $p$ we saw in Proposition 8.18 how to find an ideal $P$ of norm $p$ so this
now tells us how to factor $(p)$ as $P\overline{P}$.

A further distinction for split primes is whether the two factors of $(p) = P\overline{P}$ are
equal or not. If $P = \overline{P}$ then $p$ is said to be ramified in $R_\Delta$. According to part (c) of
Proposition 8.18 the ramified primes are exactly the primes that divide $\Delta$.


Now we turn to proving the unique factorization property for ideals in $R_\Delta$. It will
be helpful to have a criterion for when one ideal $L$ in $R_\Delta$ divides another ideal $M$,
meaning that $M = LK$ for some ideal $K$. For individual elements of $R_\Delta$ it is easy to
tell when one element divides another since $\alpha$ divides $\beta$ exactly when the quotient
$\beta/\alpha$ lies in $R_\Delta$. For ideals, however, the criterion is rather different:


Proposition 8.31. An ideal $L$ in $R_\Delta$ divides an ideal $M$ if and only if $L$ contains $M$.


One can remember this as “to divide is to contain”. At first glance the proposition 
may seem a little puzzling since for ordinary numbers the divisors of a number n, 
apart from n itself, are smaller than n while for ideals the divisors are larger, where 
“larger” for sets means that one set contains the other. The puzzle can be resolved 
by interpreting “m divides n” as “the multiples of m contain the multiples of n”. 

The proposition gives some insight into the choice of the ideals $P$ and $Q$ in the
example preceding Proposition 8.25 where we factored the ideal $(6)$ in $\mathbb{Z}[\sqrt{-5}]$ as
$(2)(3) = P\overline{P} \cdot Q\overline{Q}$ and as $(1 + \sqrt{-5})(1 - \sqrt{-5}) = PQ \cdot \overline{P}\,\overline{Q}$. Since we want $P\overline{P} = (2)$ and
$PQ = (1 + \sqrt{-5})$, this means that $P$ should divide both $(2)$ and $(1 + \sqrt{-5})$. By the
above Proposition 8.31 this is the same as saying that $P$ should contain both $(2)$ and
$(1 + \sqrt{-5})$. An obvious ideal with this property is the ideal $(2, 1 + \sqrt{-5})$. Similarly
one would be led to try $Q = (3, 1 + \sqrt{-5})$. Then one could check that these choices
for $P$ and $Q$ actually work.


Before proving the proposition let us derive a fact which will be used in the proof,
a cancellation property of multiplication of ideals: If $LM_1 = LM_2$ then $M_1 = M_2$. To
see this, first multiply the equation $LM_1 = LM_2$ by $\overline{L}$ to get $L\overline{L}M_1 = L\overline{L}M_2$. Since
$L\overline{L} = (n)$ for $n = N(L)$, a positive integer, we then have $(n)M_1 = (n)M_2$, which is
equivalent to saying $nM_1 = nM_2$. Thus the rescalings $nM_1$ and $nM_2$ of $M_1$ and $M_2$
are equal, so after rescaling again by the factor $1/n$, we get $M_1 = M_2$.

Now let us prove the proposition. 


Proof: Suppose first that $L$ divides $M$, so $M = LK$ for some ideal $K$. A typical element
of $LK$ is a sum $\sum_i \alpha_i\beta_i$ with $\alpha_i \in L$ and $\beta_i \in K$ for all $i$. Since $L$ is an ideal, each
term $\alpha_i\beta_i$ is then in $L$ and hence so is their sum. This shows that $L$ contains $LK = M$.

For the converse, suppose $L$ contains $M$. Then $L\overline{L}$ contains $M\overline{L}$. Since $L\overline{L} = (n)$
for $n = N(L)$, this says that $(n)$ contains $M\overline{L}$, so every element of $M\overline{L}$ is a multiple
of $n$ by some element of $R_\Delta$. This means that if we write $M\overline{L} = (\alpha, \beta)$ then we can
define an ideal $K$ by letting $K = (\alpha/n, \beta/n)$.

Now we have $(n)K = (n)(\alpha/n, \beta/n) = (\alpha, \beta) = M\overline{L}$. Multiplying by $L$ we then
have $(n)KL = M\overline{L}L = M(n)$. Canceling the factor $(n)$ gives the equation $KL = M$,
which says that $L$ divides $M$, finishing the proof of the converse. □


When we proved unique prime factorization for $\mathbb{Z}$ and those rings $R_\Delta$ that have
a Euclidean algorithm, a key step was showing that if a prime $p$ divides a product $ab$
then $p$ must divide either $a$ or $b$. Now we prove the corresponding fact for ideals:


Lemma 8.32. If a prime ideal $P$ divides a product $LM$ of two ideals, then $P$ must
divide either $L$ or $M$.


Proof: We will prove the equivalent statement that if $P$ divides $LM$ but not $L$, then $P$
divides $M$. Consider the set $P + L$ of all sums $\alpha + \beta$ of elements $\alpha \in P$ and $\beta \in L$. This
set $P + L$ is an ideal since if $P = (\alpha_1, \alpha_2)$ and $L = (\beta_1, \beta_2)$ then $P + L = (\alpha_1, \alpha_2, \beta_1, \beta_2)$.
The ideal $P + L$ is strictly larger than $P$ since the assumption that $P$ does not divide $L$
means that $P$ does not contain $L$, so any element of $L$ not in $P$ is in $P + L$ but not $P$.
Thus $P + L$ contains $P$, hence divides $P$, but is not equal to $P$. Since $P$ is prime we
must then have $P + L = R_\Delta$.

In particular $P + L$ contains 1 so we can write $1 = \alpha + \beta$ for some $\alpha \in P$ and
$\beta \in L$. For an arbitrary element $\gamma \in M$ we then have $\gamma = \alpha\gamma + \beta\gamma$. The term $\alpha\gamma$ is
in $P$ since $\alpha$ is in $P$ and $P$ is an ideal. The term $\beta\gamma$ is in $LM$ since $\beta$ is in $L$ and $\gamma$
is in $M$. We assume $P$ divides $LM$ so $P$ contains $LM$ and it follows that $\beta\gamma$ is in $P$.
Thus both terms on the right side of the equation $\gamma = \alpha\gamma + \beta\gamma$ are in $P$ so $\gamma$ is in $P$.
Since $\gamma$ was an arbitrary element of $M$ this shows that $M$ is contained in $P$, or in
other words $P$ divides $M$, which is what we wanted to prove. □

Now we can prove our main result:


Theorem 8.33. Every ideal in $R_\Delta$ other than $R_\Delta$ itself is a product of prime ideals,
and this factorization is unique up to the order of the factors.


Proof: We first show the existence of a prime factorization for each ideal $L \ne R_\Delta$. If
$L$ is prime itself there is nothing to prove, so suppose $L$ is not prime, hence there
is a factorization $L = KM$ with neither factor equal to $R_\Delta$. Taking norms, we have
$N(L) = N(K)N(M)$. Both $N(K)$ and $N(M)$ are greater than 1 since $R_\Delta$ is the only
ideal of norm 1. Hence $N(K) < N(L)$ and $N(M) < N(L)$. By induction on the norm,
both $K$ and $M$ have prime factorizations, hence so does $L = KM$. We can start the
induction with the case $N(L) = 2$, a prime, hence $L$ is prime. (The case $N(L) = 1$
does not arise since $L \ne R_\Delta$.)

For the uniqueness, suppose an ideal $L$ has prime factorizations $P_1 \cdots P_k$ and
$Q_1 \cdots Q_l$. We can assume $k \le l$ by a notational change if necessary. The prime ideal
$P_1$ divides the product $Q_1(Q_2 \cdots Q_l)$ so by the preceding lemma it must divide either
$Q_1$ or $Q_2 \cdots Q_l$. In the latter case the same reasoning shows it must divide either $Q_2$
or $Q_3 \cdots Q_l$. Repeating this argument enough times, we eventually deduce that $P_1$
must divide some $Q_i$, and after permuting the factors of $Q_1 \cdots Q_l$, we can assume
that $P_1$ divides $Q_1$. When one prime ideal divides another prime ideal they must be
equal. For if $P$ divides $Q$ then $Q = PM$ for some $M$, but if $Q$ is prime then either
$P = R_\Delta$, which is impossible if $P$ is prime, or $M = R_\Delta$, hence $P = Q$.

Once we have $P_1 = Q_1$ we can cancel this common factor of $P_1 \cdots P_k$ and
$Q_1 \cdots Q_l$ to get $P_2 \cdots P_k = Q_2 \cdots Q_l$. Repeating this process often enough, we eventually
get, after suitably permuting the $Q_i$'s, that $P_1 = Q_1$, $P_2 = Q_2$, $\dots$, $P_{k-1} = Q_{k-1}$,
and $P_k = Q_k \cdots Q_l$. Since $P_k$ is prime, as are the $Q_i$'s, the equation $P_k = Q_k \cdots Q_l$
can have only one term on the right side, so $k = l$ and $P_k = Q_k$. This finishes the
proof of the uniqueness of prime factorizations of ideals. □


From unique factorization we can deduce that there are no other prime ideals 
beyond those we saw in Proposition 8.30. 


Corollary 8.34. All prime ideals $P$ in $R_\Delta$ are factors of ideals $(p)$ for primes $p$,
with either $(p) = P$ or $(p) = P\overline{P}$.


Proof: Let $P$ be a prime ideal in $R_\Delta$. We have $P\overline{P} = (N(P))$. Writing $N(P)$ as a
product $p_1 \cdots p_k$ of primes $p_i$, we then have $P\overline{P} = (p_1) \cdots (p_k)$. Thus $P$ divides
$(p_1) \cdots (p_k)$ so since $P$ is prime it must divide one of the factors. This means there
is a prime $p$ such that $P$ divides $(p)$. Proposition 8.30 then finishes the proof. □


Let us consider how one can find the prime factorization of a given ideal. The
procedure will be analogous to how we factored Gaussian integers in Section 8.1. We
begin with an example in the case $\Delta = -24$ with $R_\Delta = \mathbb{Z}[\sqrt{-6}]$. We looked at this case
in Section 8.3 when we considered how to find ideals of a given norm. For the norm
35 we found the two ideals $(35, 8 + \sqrt{-6})$ and $(35, 13 + \sqrt{-6})$ and their conjugates.
The prime factors of these ideals will have norms dividing 35, so either 5 or 7, with
one factor of norm 5 and one of norm 7. We found the ideals of norms 5 and 7,
which were $(5, 2 \pm \sqrt{-6})$ and $(7, 1 \pm \sqrt{-6})$, and we need to see now which of these
ideals divide $(35, 8 + \sqrt{-6})$ and which divide $(35, 13 + \sqrt{-6})$, or in other words, which
of these ideals contain $(35, 8 + \sqrt{-6})$ and which contain $(35, 13 + \sqrt{-6})$. This will be
easy using the following general fact:


Lemma 8.35. A lattice $L(a, b + c\tau)$ in $\mathbb{Z}[\tau]$ contains another lattice $L(a', b' + c'\tau)$
if and only if $a$ divides $a'$, $c$ divides $c'$, and $b' \equiv b\,c'/c \bmod a$.


Proof: For $L(a, b + c\tau)$ to contain $L(a', b' + c'\tau)$ means that both $a'$ and
$b' + c'\tau$ are in $L(a, b + c\tau)$. For $a'$, the only integers in $L(a, b + c\tau)$ are the multiples
of $a$, so the condition on $a'$ is that it must be a multiple of $a$. For $b' + c'\tau$ to be
in $L(a, b + c\tau)$ means that the equation $b' + c'\tau = ax + (b + c\tau)y$ must have an
integer solution. Equating the coefficients of $\tau$ gives $c' = cy$ which just says that $c'$
is a multiple of $c$, with $y = c'/c$. Then the equation becomes $b' = ax + b\,c'/c$ which
is equivalent to the congruence $b' \equiv b\,c'/c \bmod a$. □
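
The criterion of Lemma 8.35 translates directly into a test on integer triples. In this Python sketch a lattice $L(a, b + c\tau)$ is represented by the triple `(a, b, c)`, which is a representation chosen here for illustration. An ideal such as $(5, 2 - \sqrt{-6})$ is entered as `(5, -2, 1)` since it equals $L(5, -2 + \sqrt{-6})$.

```python
def contains(big, small):
    """Does L(a, b + c*tau) contain L(a2, b2 + c2*tau)?  (Lemma 8.35)"""
    a, b, c = big
    a2, b2, c2 = small
    if a2 % a != 0 or c2 % c != 0:
        return False            # a must divide a', and c must divide c'
    return (b2 - b * (c2 // c)) % a == 0   # b' = b*c'/c mod a


# which ideals of norm 5 and 7 contain (35, 8 + sqrt(-6))?
target = (35, 8, 1)
for candidate in [(5, 2, 1), (5, -2, 1), (7, 1, 1), (7, -1, 1)]:
    print(candidate, contains(candidate, target))
```

Only the candidates whose test succeeds can divide the target ideal, by Proposition 8.31.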


Applying this lemma to determine which of $(5, 2 \pm \sqrt{-6})$ contains $(35, 8 + \sqrt{-6})$
we see that the two divisibility conditions are satisfied and the congruence condition
is $8 \equiv \pm 2 \bmod 5$ where the sign is the same as in $(5, 2 \pm \sqrt{-6})$. The minus sign gives
a valid congruence so it is $(5, 2 - \sqrt{-6})$ that divides $(35, 8 + \sqrt{-6})$. For $(7, 1 \pm \sqrt{-6})$ to
divide $(35, 8 + \sqrt{-6})$ the divisibility conditions are again satisfied and the congruence
condition is now $8 \equiv \pm 1 \bmod 7$, so this time the plus sign is correct and $(7, 1 + \sqrt{-6})$
divides $(35, 8 + \sqrt{-6})$. Thus we obtain the prime factorization of $(35, 8 + \sqrt{-6})$ as
$(5, 2 - \sqrt{-6})(7, 1 + \sqrt{-6})$. In similar fashion one finds that $(35, 13 + \sqrt{-6})$ factors as
$(5, 2 - \sqrt{-6})(7, 1 - \sqrt{-6})$. Taking the conjugates of these two factorizations gives the
factorizations of the other two ideals of norm 35.


The general procedure for finding the prime factorization of an ideal $L$ in $R_\Delta$ can
be described as follows. As an easy first step one finds the largest positive integer $n$
dividing each generator for $L$, assuming $L$ is given in terms of generators. This gives
a factorization $L = nL' = (n)L'$ with $L'$ a primitive ideal. Factoring $(n)$ into prime
ideals is done by first factoring $n$ as a product of primes $p_i$ and then factoring the
corresponding principal ideals $(p_i)$ as in Proposition 8.30. This reduces the problem
to the case that $L$ is a primitive ideal. To do this case one computes $N(L)$, say by
finding a reduced basis for $L$, then one factors $N(L)$ as $N(L) = p_1^{e_1} \cdots p_k^{e_k}$ for distinct
primes $p_i$. These must be split primes, otherwise $L$ would not be primitive. After
factoring each principal ideal $(p_i)$ as $P_i\overline{P}_i$, one can then determine which of $P_i$ or $\overline{P}_i$
divides $L$ by applying the preceding Lemma 8.35. Only one of $P_i$ and $\overline{P}_i$ can divide $L$
since $L$ is primitive, so the prime factorization of $L$ is then obtained from the product
$p_1^{e_1} \cdots p_k^{e_k}$ by replacing each $p_i$ by the ideal $P_i$ or $\overline{P}_i$ that divides $L$.

Unique prime factorization for ideals can be used to determine the number of 
times each number n appears in a given topograph. Let us illustrate this by returning 
to the case of discriminant −24 where there are the two forms x² + 6y² and 2x² + 3y². 
As we saw in Section 8.3, the number of appearances of n for both forms together 
is the same as the number of primitive ideals of norm n. The norms of primitive 
ideals are the numbers n = 2^a 3^b p_1^{e_1} ⋯ p_k^{e_k} with a ≤ 1, b ≤ 1, and the p_i's distinct 
unramified split primes. The primitive ideals of norm n are then obtained by replacing 
the factors 2 and 3 in 2^a 3^b p_1^{e_1} ⋯ p_k^{e_k} by the ideals (2, √−6) and (3, √−6) and each 
p_i^{e_i} by either P_i^{e_i} or P̄_i^{e_i} where (p_i) = P_i P̄_i. Thus there are exactly 2^k primitive ideals 
of norm n, so this is the total number of times that n appears in the two topographs 
combined. We know from Chapter 6 that no number is represented by both forms, 
and the form representing n is x² + 6y² or 2x² + 3y² according to whether the 
character values χ_3(n) and χ_8(n) are both +1 or both −1. 
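
The count 2^k can be compared with a brute-force count for small n. The sketch below is an illustration rather than part of the argument: the function names appearances and predicted are hypothetical, an "appearance" is taken to mean a primitive pair (x, y) counted up to overall sign, and a prime p other than 2 and 3 is tested for being split by Euler's criterion applied to −6.

```python
from math import gcd, isqrt


def appearances(n):
    """Count primitive pairs (x, y), up to overall sign, with
    x^2 + 6y^2 = n or 2x^2 + 3y^2 = n."""
    bound = isqrt(n)
    count = 0
    for a, c in ((1, 6), (2, 3)):
        for y in range(0, bound + 1):
            for x in range(-bound, bound + 1):
                if a * x * x + c * y * y != n or gcd(x, y) != 1:
                    continue
                # (x, y) and (-x, -y) are the same region: keep y > 0, or y == 0 and x > 0
                if y > 0 or x > 0:
                    count += 1
    return count


def predicted(n):
    """2^k if n = 2^a 3^b p_1^{e_1}...p_k^{e_k} with a, b <= 1 and each p_i split, else 0."""
    k = 0
    p = 2
    while n > 1:
        if p * p > n:
            p = n                      # whatever is left is prime
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p in (2, 3):
                if e > 1:              # ramified primes may occur only once
                    return 0
            elif pow(-6, (p - 1) // 2, p) != 1:
                return 0               # -6 is not a square mod p, so p is inert
            else:
                k += 1
        else:
            p += 1
    return 2 ** k


if __name__ == "__main__":
    for n in (5, 6, 7, 10, 35):
        print(n, appearances(n), predicted(n))
```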


In some cases the unique factorization property for ideals implies unique 
factorization for elements of R_Δ. The relation between the two situations is obtained by 
associating to each nonzero element α in R_Δ the principal ideal (α). Multiplication 
of elements corresponds to multiplication of ideals since (αβ) = (α)(β). A key 
observation is that (α) = (β) if and only if α and β differ only by multiplication by a 
unit. For if β = εα for some unit ε then (ε) contains εε⁻¹ = 1 so (ε) = R_Δ, hence 
(β) = (εα) = (ε)(α) = (α). Conversely, if (α) = (β) then β is in (α) so β = εα for 
some ε ∈ R_Δ, and similarly α = ηβ for some η ∈ R_Δ. Thus α = ηβ = ηεα, hence 
ηε = 1 so ε and η are units, showing that α and β differ just by a unit. 


Proposition 8.36. If all ideals in R_Δ are principal ideals then all elements of R_Δ 
other than units and 0 have unique factorizations as products of prime elements, 
where the uniqueness is up to order and multiplication by units. 


Proof: This follows immediately from Theorem 8.33 since principal ideals in R_Δ 
correspond exactly to nonzero elements of R_Δ up to multiplication by units. □


Proposition 8.37. When Δ < 0 all ideals are principal if and only if all forms are 
equivalent to the principal form. When Δ > 0 all ideals are principal if and only if 
all forms are equivalent to either the principal form or its negative. 


Proof: All principal ideals in R_Δ are equivalent since they are equivalent to R_Δ itself. 
In fact the principal ideals form a complete equivalence class of ideals since any ideal 
that is equivalent to a principal ideal is also a principal ideal by the following argument. 
Suppose an ideal L is equivalent to a principal ideal (α), so βL = γ(α) for nonzero 
elements β and γ of R_Δ. Then γα is in βL, which means γα = βδ for some δ in L, 
and hence we have βL = γ(α) = (γα) = (βδ) = β(δ). Thus βL = β(δ), so after 
multiplying both sides of this equation by β⁻¹ in Q(√Δ) we have L = (δ), a principal 
ideal. 

To prove the proposition we will use the one-to-one correspondence between 
proper equivalence classes of forms and strict equivalence classes of ideals. The 
principal form has mirror symmetry so forms equivalent to this form are properly 
equivalent to it, and the same holds for the negative of the principal form, which only 
enters the picture when Δ > 0. 

We distinguish three cases: 


Case 1: Δ < 0. Here equivalence of ideals is the same as strict equivalence. The 
principal form has leading coefficient 1 so it corresponds to the principal ideal R_Δ. Thus 
all forms are equivalent to the principal form exactly when all ideals are equivalent to 
R_Δ, or in other words, all ideals are principal. 


Case 2: Δ > 0 and the principal form is equivalent to its negative. The principal 
form then represents −1 so equivalence of ideals is again the same as strict 
equivalence. Thus there is a single equivalence class of forms exactly when there is a single 
equivalence class of ideals, the principal ideals. 


Case 3: Δ > 0 and the principal form is not equivalent to its negative. These forms 
then give two different equivalence classes of forms, and we will show that they 
correspond to two different strict equivalence classes of principal ideals (α), those with 
N(α) > 0 and those with N(α) < 0. 

Any two ideals (α) and (β) with N(α) > 0 and N(β) > 0 are strictly equivalent 
since they are both strictly equivalent to (1). Likewise (α) and (β) are strictly 
equivalent if N(α) < 0 and N(β) < 0 since if γ is any element with N(γ) < 0, for example 
α or β, then (α) and (β) are both strictly equivalent to (αβγ) since N(βγ) > 0 and 
N(αγ) > 0. Now suppose (α) and (β) are strictly equivalent with N(α) and N(β) 
having opposite sign. Then (γα) = (δβ) for some γ and δ of positive norm. This 
means we have elements α' = γα and β' = δβ with (α') = (β') and such that the 
norms of α' and β' have opposite sign. Since (α') = (β') we have β' = εα' for some 
unit ε. Since N(α') and N(β') have opposite sign we must have N(ε) < 0. This 
means that the principal form represents −1 so its topograph has a skew symmetry, 
making it equivalent to its negative, contrary to hypothesis. Thus we have shown that 
the equivalence class of principal ideals (α) splits into two strict equivalence classes 
according to the sign of N(α). 

Now we show that the negative of the principal form corresponds to a principal 
ideal (α) with N(α) < 0. The principal form is x² − dy² if Δ = 4d and x² + xy − dy² 
if Δ = 4d + 1. The negative of the principal form has leading coefficient −1 so to 
find the corresponding ideal as in Theorem 8.21 we first have to choose a properly 
equivalent form with positive leading coefficient. For this we can choose dx² − y² or 
dx² + xy − y², obtained from the negative of the principal form by replacing x, y 
by −y, x, rotating the topograph by 180 degrees. For dx² − y² the associated ideal 
is L(d, √d) which is the principal ideal (√d) since d = √d · √d so d is an element of 
(√d). For dx² + xy − y² the corresponding ideal is L(d, (1 + √Δ)/2) which is 
((1 + √Δ)/2) since d = ((1 + √Δ)/2) · ((−1 + √Δ)/2). In both cases the norm of the 
element √d or (1 + √Δ)/2 generating the ideal is −d so it is negative. 

Thus in Case 3 the two strict equivalence classes of principal ideals correspond 
to the equivalence classes of the principal form and its negative, so these are the only 
two equivalence classes of forms exactly when all ideals are principal. □


An example for the third case in this proof is Δ = 12 where the class number 
is 2 corresponding to the principal form x² − 3y² and its negative. The primes 
represented in discriminant 12 are 2, 3, and the odd primes p with Legendre symbol 
(12/p) = (3/p) = (−1/p)(p/3) = +1 so these are the primes p ≡ ±1 mod 12. The two 
forms are of different genus, with x² − 3y² representing primes p ≡ +1 mod 12 
and −x² + 3y² representing primes p ≡ −1 mod 12. By Proposition 8.7 the primes 
p that factor in R_Δ = Z[√3] are the primes represented by either of the two forms, 
for example 2 = (√3 + 1)(√3 − 1), 3 = (√3)², 11 = (2√3 + 1)(2√3 − 1), and 13 = 
(4 + √3)(4 − √3). Here the factorization of 11 comes from the value −11 in the ±1/2 
regions in the topograph of the principal form while the factorization of 13 comes 
from the 13 in the ±4/1 regions. 


[Figure: topograph of Q(x, y) = x² − 3y²]

In this example prime factorizations are unique up to units, but there are infinitely 
many units for positive discriminants so there can be many factorizations that look 
rather different but are obtained just by inserting units. For example the topograph 
also gives 13 = (5 + 2√3)(5 − 2√3) from the ±5/2 regions so 5 + 2√3 must be a unit 
times either 4 + √3 or 4 − √3. One can determine which by computing which of the 
two quotients (5 + 2√3)/(4 + √3) and (5 + 2√3)/(4 − √3) lies in Z[√3]. One finds that 
the latter quotient is the unit 2 + √3 so 5 + 2√3 = (2 + √3)(4 − √3). In terms of the 
topograph, multiplication by the fundamental unit 2 + √3 translates the topograph by 
one period to the right, while conjugation is reflection across the vertical line through 
the 1/0 and 0/1 regions. So to get from 4/1 to 5/2 we first reflect 4/1 to −4/1, then we 
translate by one period to get 5/2. 
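
The two quotients can be computed exactly with rational arithmetic. The following sketch is illustrative only: the helper names divide and is_integral are hypothetical, and an element a + b√3 is represented by the pair (a, b).

```python
from fractions import Fraction


def divide(num, den, d=3):
    """Return (p, q) with num/den = p + q*sqrt(d), where num and den are
    pairs (a, b) standing for a + b*sqrt(d)."""
    a, b = num
    c, e = den
    norm = c * c - d * e * e               # N(c + e*sqrt(d))
    # multiply the numerator by the conjugate c - e*sqrt(d), then divide by the norm
    p = Fraction(a * c - d * b * e, norm)
    q = Fraction(b * c - a * e, norm)
    return p, q


def is_integral(z):
    """True when both coordinates are integers, i.e. z lies in Z[sqrt(d)]."""
    return all(t.denominator == 1 for t in z)


if __name__ == "__main__":
    print(divide((5, 2), (4, 1)))     # (5 + 2*sqrt(3)) / (4 + sqrt(3))
    print(divide((5, 2), (4, -1)))    # (5 + 2*sqrt(3)) / (4 - sqrt(3))
```

Only the second quotient has integer coordinates, in agreement with 5 + 2√3 = (2 + √3)(4 − √3).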

As this example shows, for prime factorizations it makes little difference if the 
principal form is not equivalent to its negative since changing the sign of an element 
of R_Δ is just multiplying it by the unit −1. The issue could be avoided entirely by 
using the version of the ideal class group based on equivalence of ideals rather than 
strict equivalence. 


Let us conclude this section with some comments on what happens when the 
discriminant Δ is not a fundamental discriminant. One might hope that the unique 
factorization property for ideals still holds at least for stable ideals, the ideals 
corresponding to primitive forms. However, this is not the case, and here is an example. 
Take Δ = −12, so R_Δ = Z[√−3]. The class number is 1 in this case so all stable 
ideals are principal (and recall that principal ideals are always stable). Consider the 
factorizations (4) = (2)(2) = (1 + √−3)(1 − √−3). The ideals (2) and (1 ± √−3) are 
prime since their norms are 4 so any nontrivial factorization as (α)(β) would have 
N(α) = N(β) = 2 but no elements of Z[√−3] have norm 2 since x² + 3y² = 2 has 
no integer solutions. The three ideals (2) and (1 ± √−3) are distinct since the only 
units in Z[√−3] are ±1. Thus we have two different factorizations of (4) into prime 
ideals when we restrict attention just to stable ideals. If one drops this restriction 
then unique prime factorization still fails since for the ideal L = (2, 1 + √−3) we saw 
in the discussion following Proposition 8.27 that L² = 2L, but unique factorization 
implies the cancellation property so we would then have L = (2), which is false. 

One might ask where the proof of unique factorization breaks down for stable 
ideals in the case of a nonfundamental discriminant. The answer is in the key property 
in Lemma 8.32 that if a prime ideal P divides a product LM then it must divide one of 
the factors L or M. In the proof of this we considered the ideal P + L, but unfortunately 
this need not be a stable ideal when P and L are stable. For example, in the preceding 
paragraph if we take P = (2), L = (1 + √−3), and M = (1 − √−3) then P + L is the 
ideal (2, 1 + √−3), but this is not stable as we saw after Proposition 8.27. And in fact 
the ideal (2) does not divide either (1 + √−3) or (1 − √−3). 


Exercises 


1. (a) Find the ideals of norm 39 in Z[√10] and find the factorizations of these ideals 
into prime ideals. 
(b) Do the same for the ideals in Z[√10] of norm 10, 15, and 30. 


2. Let p_1, p_2, p_3, p_4 be distinct primes represented by the form 2x² + 3y². Show that 
there is an element of Z[√−6] of norm p_1 p_2 p_3 p_4 having three different factorizations 
as products of prime elements of Z[√−6], where factorizations that differ just by units 
are not regarded as different factorizations. 


3. For a fundamental discriminant Δ let us define two ideals L and L' in R_Δ to be 
scale equivalent if there exist positive integers m and n such that mL = nL'. Show 
that the set of scale equivalence classes of ideals in R_Δ forms a group with respect to 
the usual multiplication of ideals, and determine the structure of this group. 


8.6 Applications to Forms 


As we have seen, ideals provide an alternative way of constructing the class group 
CG(Δ). One of the main uses of the group structure in CG(Δ) in Chapter 7 was in 
Theorem 7.7 which characterized the primitive forms of discriminant Δ representing 
a given number n in terms of the forms representing the prime factors of n, or 
prime-power factors in the case of primes dividing the conductor. When Δ is a fundamental 
discriminant the same characterization can be derived from the unique factorization 
property of ideals in R_Δ. This viewpoint provides additional insights into the 
somewhat subtle answer to the representation problem. Here is a restatement of the result 
we will now prove using ideals: 


Theorem 8.38. Let Δ be a fundamental discriminant and let n > 1 be a number 
represented by at least one form of discriminant Δ. If the prime factorization of 
n is n = p_1^{e_1} ⋯ p_k^{e_k} for distinct primes p_i, with e_i = 1 for each p_i dividing Δ 
and e_i ≥ 1 otherwise, then the forms of discriminant Δ representing n are exactly 
the forms Q_1^{±e_1} ⋯ Q_k^{±e_k} where Q_i represents p_i and the product Q_1^{±e_1} ⋯ Q_k^{±e_k} is 
formed in the class group CG(Δ). 


There are a few facts that are used in the proof that we will explain in advance 
to avoid complicating the later arguments. The first is the elementary fact that an 
element α in R_Δ belongs to an ideal L if and only if the ideal (α) factors as (α) = LM 
for some ideal M. This is because α is an element of L exactly when the ideal (α) is 
contained in L, or in other words, when L divides (α), which means (α) = LM for 
some ideal M. 


Next is a reformulation of what it means for a form Q_L to represent a number n. 
By definition, Q_L(α) = N(α)/N(L) for α in L. Thus if we choose a basis α_1, α_2 
for L regarded as a lattice and we let α = xα_1 + yα_2 for integers x and y, then 
Q_L(x, y) = N(xα_1 + yα_2)/N(L). For this to give a representation of n means that 
x and y are coprime. In terms of α this is saying that α is not a multiple mβ of 
any element β of L with m > 1. This last condition can be abbreviated to saying just 
that α is primitive in L. 

We have also defined what it means for an ideal L to be primitive, namely, L is not 
a multiple mL' of any other ideal L' with m > 1, or equivalently, L is not divisible 
by any principal ideal (m) with m > 1. We could require m to be a prime without 
affecting the definition since if L = mL' with m = pq for p a prime then L is p 
times the ideal qL'. By Proposition 8.16 every ideal in R_Δ is equal to nL_Q for some 
integer n ≥ 1 and some form Q of discriminant Δ, so the primitive ideals are exactly 
the ideals L_Q. 

An equivalent way of formulating the condition for L to be primitive is to say that 
the factorization L = P_1 ⋯ P_k as a product of prime ideals satisfies the following two 
conditions: 


(1) No P_i is a prime ideal (p) with p a prime integer. Thus each P_i has norm a prime 
rather than the square of a prime. 


(2) There is no pair of factors P_i and P_j with i ≠ j such that P_i = P̄_j. In particular 
if P_i = P̄_i then P_i can occur only once in the prime factorization of L. 


Proof of Theorem 8.38: Suppose that a number n > 1 is represented by a form Q. 
From the correspondence between proper equivalence classes of forms and strict 
equivalence classes of ideals we may assume Q = Q_L for some ideal L. Thus we 
have n = Q_L(α) = N(α)/N(L) for some primitive α in L. Since n and N(L) are 
positive, so is N(α). 

We can reduce to the case that α is a positive integer by the following argument. 
We have n = N(α)/N(L) = N(ᾱα)/N(ᾱL) = Q_{ᾱL}(ᾱα). The element ᾱα of ᾱL is 
primitive in ᾱL since if ᾱα = qᾱβ for some positive integer q and some β in L, then 
α = qβ which forces q to be 1 since α is primitive in L. The integer m = ᾱα is 
N(α) which is positive as noted above. Also, m is in ᾱL since α is in L. The ideals L 
and ᾱL are strictly equivalent since N(ᾱ) = N(α) > 0, so the forms Q_L and Q_{ᾱL} are 
properly equivalent. This shows that we may take n to be represented as n = Q_L(m) 
for some primitive positive integer m in the new L. 

Next we reduce to the case that L is a primitive ideal. If L is not primitive we can 
write it as L = qL' for some integer q > 1 with L' primitive. Since m is in L = qL' we 
have m = qr for some r in L', and in fact r must be an integer since r = m/q and 
the only rational numbers in R_Δ are integers. Since m and q are positive, so is r. 
Also, r is primitive in L' since m is primitive in L and we are just rescaling m and 
L by a factor of 1/q to get r and L'. The equation n = N(m)/N(L) can be written as 
n = N(qr)/N(qL') = N(r)/N(L') since qL' = (q)L' and N((q)) = N(q). This shows 
that n is represented as Q_{L'}(r) = n. The form Q_{L'} is properly equivalent to Q_L since 
L = qL' and N(q) > 0. The net result of this argument is that we can assume that n 
is represented as n = Q_L(m) = N(m)/N(L) where L is primitive and m is a positive 
integer that is a primitive element of L. 

Since m is in L we have (m) = LM for some ideal M. This M must also be 
primitive, otherwise if M = qM’ for some ideal M’ and some integer q > 1, then, 
arguing as in the preceding paragraph, we would have m = qr for some positive 
integer r in M’ with (r) = LM’. This last equality implies that r is in L, so m would 
not be primitive in L. 

Since L and M are both primitive, their factorizations into prime ideals satisfy the 
earlier conditions (1) and (2). Then since their product is (m) with m an integer, we 
must have M = L̄. Thus (m) = LL̄ and so m = N(L). Now we have n = N(m)/N(L) = 
m²/m = m so n = m and the representation of n becomes n = Q_L(n) with L 
primitive and n = N(L). 

Let the factorization of L into prime ideals be L = P_1 ⋯ P_l. Then N(P_i) is a 
prime p_i and p_i is in P_i since P_i P̄_i = (p_i). Also, p_i is primitive in P_i since p_i is 
prime so if p_i was not primitive in P_i then P_i would contain 1 which is impossible 
since P_i ≠ R_Δ. If we denote Q_{P_i} by Q_i for simplicity then Q_i represents p_i since 
Q_i(p_i) = N(p_i)/N(P_i) = p_i²/p_i = p_i. 

Since n = N(L) and L = P_1 ⋯ P_l we have n = p_1 ⋯ p_l. The prime factorization 
n = p_1 ⋯ p_l is unique so the prime ideals P_i are uniquely determined by n up to 
the ambiguity of replacing P_i by P̄_i. In CG(Δ) this amounts to replacing Q_i by Q_i⁻¹. 
Keeping in mind condition (2), we have now shown that if a form Q represents n then 
in CG(Δ) we have Q = Q_1^{±e_1} ⋯ Q_k^{±e_k} where n = p_1^{e_1} ⋯ p_k^{e_k} is the factorization of 
n into powers of distinct primes p_i and the form Q_i represents p_i. Condition (2) 
implies that e_i = 1 for each i with P_i = P̄_i, that is, for each p_i that divides the 
discriminant Δ. 

To show the converse, suppose n = p_1^{e_1} ⋯ p_k^{e_k} is the factorization of n into 
powers of distinct primes p_i with e_i = 1 when p_i divides Δ, and suppose the form 
Q_i represents p_i. Our objective is then to show that Q_1^{±e_1} ⋯ Q_k^{±e_k} represents n. By 
the arguments in the first part of the proof applied to p_i in place of n there is an ideal 
L_i containing p_i with N(L_i) = p_i, so L_i is a prime ideal since its norm p_i is prime. 
If we set L = L_1^{e_1} ⋯ L_k^{e_k} then L is primitive since its factorization into prime ideals 
satisfies conditions (1) and (2). We have n ∈ L since each p_i is in L_i. Also we have 
N(L) = N(L_1)^{e_1} ⋯ N(L_k)^{e_k} = p_1^{e_1} ⋯ p_k^{e_k} = n. Thus Q_L(n) = N(n)/N(L) = n²/n = 
n which means that Q_L represents n provided that n is primitive in L. If n is not 
primitive in L then it factors as n = qr for some integer q > 1 and some r in L. By 
an earlier argument r must be a positive integer. Since r is in L, we have (r) = LM 
for some ideal M. Then (n) = (qr) = qLM. We also have (n) = LL̄ since N(L) = n. 
Thus qLM = LL̄ so the cancellation property for ideals implies that L̄ = qM. Taking 
conjugates, this says L = qM̄. This contradicts the fact that L is primitive. Thus we 
have shown that Q_L represents n. 

We have Q_{L_i}(p_i) = N(p_i)/N(L_i) = p_i²/p_i = p_i. Thus both Q_i and Q_{L_i} represent 
the prime p_i so they must be equivalent, hence in CG(Δ) we have Q_{L_i} = Q_i^{±1}. We 
can choose the sign of the exponent at will since we are free to replace L_i by L̄_i in the 
previous arguments. Then Q_1^{±e_1} ⋯ Q_k^{±e_k} = Q_{L_1}^{e_1} ⋯ Q_{L_k}^{e_k} = Q_L since L = L_1^{e_1} ⋯ L_k^{e_k}. 
Thus Q_1^{±e_1} ⋯ Q_k^{±e_k} represents n since Q_L represents n. □


As another application of unique factorization for ideals in the rings R_Δ for 
fundamental discriminants Δ let us consider again the problem of finding which primitive 
forms represent powers of primes dividing the conductor in the case of 
nonfundamental discriminants. The large table in Section 6.2 shows some of the subtleties that can 
occur for small negative nonfundamental discriminants. Perhaps the most interesting 
cases are when infinitely many different powers of these primes are represented. The 
first three cases Δ = −28, −60, and −72 were treated in Sections 6.2, 6.3, and 8.2. 
Let us consider now the fourth case Δ = −92 where there are some new subtleties. 

For Δ = −92 the class number is 3 with the three forms x² + 23y² and 3x² ± 
2xy + 8y². The associated fundamental discriminant is Δ = −23 which also has 
class number 3, corresponding to the forms x² + xy + 6y² and 2x² ± xy + 3y². The 
conductor is 2 and this is represented in discriminant −23 by 2x² ± xy + 3y², as are 
all powers of 2 since 2 does not divide −23, so by Proposition 6.13 all powers 2^k for 
k ≥ 3 are represented by at least one of the forms x² + 23y² and 3x² ± 2xy + 8y². 
Our aim is to determine which of these powers are represented by each form. 


First consider the form x² + 23y². For elements x + √−23 y in Z[√−23] we 
have N(x + √−23 y) = x² + 23y² so we are looking for coprime integers x and y 
such that x + √−23 y has norm a power of 2. We will use the larger ring Z[ω] with 
ω = (1 + √−23)/2 since this has unique factorization of ideals, being the ring R_Δ for 
the associated fundamental discriminant −23. Using Proposition 8.18 we see that the 
principal ideal (2) in Z[ω] factors as (2) = PP̄ for P = (2, ω), with P ≠ P̄. Since 
N(2) = 4 we have N(P) = N(P̄) = 2, so N(P^k) = 2^k. The ideal P is not principal 
since there is no element of Z[ω] of norm 2, for if α in Z[ω] had norm 2 then 2α 
would be an element of Z[√−23] of norm 8 but the form x² + 23y² does not take 
on the value 8. Since the class number for discriminant −23 is 3 the class group is 
cyclic of order 3 and P generates this group. Thus the powers of P that are principal 
ideals are the powers P^{3n}. 

Suppose the element α = x + √−23 y of Z[√−23] has norm 2^k, so αᾱ = 2^k. 
Then for ideals we have (α)(ᾱ) = P^k P̄^k and hence (α) = P^r P̄^s for some r and s with 
r + s = k. We have x² + 23y² = 2^k so x and y have the same parity. We want them 
to be coprime so this means they are both odd and hence α is divisible by 2 in Z[ω]. 
This is saying that (α) is divisible by both P and P̄ since (2) = PP̄. Thus r > 0 and 
s > 0. On the other hand if r > 1 and s > 1 this would say that (α) was divisible 
by (4) and hence α was divisible by 4 in Z[ω], so x and y would both be even, a 
contradiction. Therefore one of r and s must be 1, and so in the class group where 
P̄ is the inverse of P the ideal (α) must be either 2P^{k−2} if s = 1, or 2P̄^{k−2} if r = 1. 
Since (α) is a principal ideal this implies that k − 2 is a multiple of 3, say k − 2 = 3m, 
or k = 3m + 2. Thus the only powers of 2 that could possibly be represented by 
x² + 23y² are the powers 2^k with k = 2, 5, 8, ⋯. Obviously 2² is not represented 
so this leaves 2⁵, 2⁸, 2¹¹, ⋯ as the only possibilities. 

The other two forms 3x² ± 2xy + 8y² are equivalent, though not properly 
equivalent, so they represent the same numbers. We will show that they cannot represent 
any of the powers 2⁵, 2⁸, 2¹¹, ⋯. Since each power 2^k with k ≥ 3 is represented by 
one of the forms x² + 23y² and 3x² ± 2xy + 8y² we will then know that x² + 23y² 
represents 2⁵, 2⁸, 2¹¹, ⋯ and 3x² ± 2xy + 8y² represents 2³, 2⁴, 2⁶, 2⁷, 2⁹, 2¹⁰, ⋯. 

The lattice in Z[√−23] corresponding to 3x² + 2xy + 8y² is L(3, 1 + √−23). 
This has norm 3 in Z[√−23] so we have N(3x + (1 + √−23)y)/3 = N((3x + y) + 
√−23 y)/3 = (9x² + 6xy + y² + 23y²)/3 = 3x² + 2xy + 8y², the given form. 

Suppose that x and y are coprime integers for which 3x² + 2xy + 8y² = 2^k. The 
element α = 3x + (1 + √−23)y = 3x + 2ωy in Z[√−23] then has N(α) = 3·2^k. In 
Z[ω] we have (2) = PP̄ for P = (2, ω), and we have (3) = QQ̄ for Q = (3, ω) from 
Proposition 8.18. Thus (α)(ᾱ) = QQ̄ P^k P̄^k and hence (α) is either Q P^r P̄^s or Q̄ P^r P̄^s 
for some integers r ≥ 0 and s ≥ 0 with r + s = k. The equation 3x² + 2xy + 8y² = 2^k 
implies that x is even, hence 3x + 2ωy is divisible by 2 in Z[ω]. This implies 
that r > 0 and s > 0. If r > 1 and s > 1 then 4 divides 3x + 2ωy in Z[ω] 
which implies x and y are even, violating their coprimeness. Thus either r = 1 
or s = 1, say s = 1. This means (α) = 2QP^{k−2} or (α) = 2Q̄P^{k−2}. Since (α) is a 
principal ideal this means that QP^{k−2} or Q̄P^{k−2} is a principal ideal. The product PQ 
is (2, ω)(3, ω) = (6, 2ω, 3ω, ω²) with ω² = ω − 6. It follows that PQ = (ω) since 
ω = 3ω − 2ω and 6 = ωω̄. Since PQ is a principal ideal, Q is the inverse of P in 
the class group and Q̄ is equivalent to P. 

In the case (α) = 2QP^{k−2} the ideal (α) is principal and is equivalent to P^{k−3} in the 
class group so k − 3 = 3n for some integer n, which means k ≡ 0 mod 3. In the other 
case (α) = 2Q̄P^{k−2} we have (α) equivalent to P^{k−1} in the class group so k − 1 = 3n 
and k ≡ 1 mod 3. This finishes the argument that the forms 3x² ± 2xy + 8y² cannot 
represent any of the powers 2⁵, 2⁸, 2¹¹, ⋯. Hence we know which powers of 2 each 
form x² + 23y² and 3x² ± 2xy + 8y² represents. 
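
The conclusion can be confirmed for small exponents by a direct search. The sketch below is a brute-force illustration, not the method of the text: the function name represents is hypothetical, and the search box |x|, |y| ≤ √n is an assumption that holds for the two positive definite forms used here.

```python
from math import gcd, isqrt


def represents(a, b, c, n):
    """True if a*x^2 + b*x*y + c*y^2 = n has a solution with gcd(x, y) = 1.
    The box |x|, |y| <= sqrt(n) suffices for the two forms checked below."""
    bound = isqrt(n)
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == n:
                return True
    return False


if __name__ == "__main__":
    for k in range(3, 12):
        n = 2 ** k
        # columns: k, represented by x^2 + 23y^2, represented by 3x^2 + 2xy + 8y^2
        print(k, represents(1, 0, 23, n), represents(3, 2, 8, n))
```

For each k in this range exactly one of the two columns is True, with x² + 23y² occurring precisely when k ≡ 2 mod 3.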

It is easy to be more explicit about representing 2^{3n+2} by x² + 23y². As we have 
seen, this amounts to writing the principal ideal 2P^{3n} as (x + √−23 y). The ideal P³ 
has norm 8 so it must equal (β) for some β in Z[ω] of norm 8. From the topograph 
of the norm form x² + xy + 6y² in discriminant −23 one can see that 1 + ω and 
1 + ω̄ = 2 − ω are the only elements of Z[ω] of norm 8, up to sign. Thus we obtain 
solutions of x² + 23y² = 2^{3n+2} by writing 2·(1 + ω)^n as x + √−23 y, and these are 
the only primitive solutions, up to changing the signs of x and y. We can compute 
inductively, so if 2·(1 + ω)^n = x + √−23 y then multiplying this by 1 + ω gives the 
solution for the next value of n. Since 1 + ω = (3 + √−23)/2 the inductive formula is: 


(x, y) → ((3x − 23y)/2, (x + 3y)/2) 


Here are the first few solutions: 


(x, y) | (3, 1)  (−7, 3)  (−45, 1)  (−79, −21)  (123, −71) 


One could also be explicit about solutions of 3x² + 2xy + 8y² = 2^k but the 
answers are a little more complicated so we will not do this here. 
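
The inductive computation for x² + 23y² can be carried out mechanically. The following sketch is illustrative: the function name solutions is hypothetical, and it simply multiplies repeatedly by (3 + √−23)/2 starting from 2·(1 + ω) = 3 + √−23, so its output agrees with the listed solutions up to the signs of x and y.

```python
from math import gcd


def solutions(count):
    """First `count` pairs (x, y) with x + y*sqrt(-23) = 2*(1 + w)^n, n = 1, 2, ..."""
    x, y = 3, 1                      # n = 1: 2*(1 + w) = 3 + sqrt(-23)
    out = []
    for _ in range(count):
        out.append((x, y))
        # multiply x + y*sqrt(-23) by (3 + sqrt(-23))/2; both numerators are
        # even because x and y are odd, so the integer division is exact
        x, y = (3 * x - 23 * y) // 2, (x + 3 * y) // 2
    return out


if __name__ == "__main__":
    for n, (x, y) in enumerate(solutions(5), start=1):
        print(n, (x, y), x * x + 23 * y * y == 2 ** (3 * n + 2), gcd(x, y) == 1)
```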


1. Forms of Negative Discriminant 


This table lists the proper equivalence classes of primitive forms for each negative 
discriminant down to −120. The first column gives the discriminant (up to sign), with 
an asterisk when it is not a fundamental discriminant. The second column gives the 
class number. In most cases in the table the class group is cyclic so the class number 
determines the class group. The exceptions are indicated by writing the class number 
as a product corresponding to the factorization of the class group as a product of 
cyclic groups. Thus 2·2 means class number 4 with class group the product of two 
cyclic groups of order 2. The third column gives the various characters for each 
discriminant. These correspond to the prime divisors of the discriminant, with a few 
exceptions for the prime 2 in cases with nonfundamental discriminants. The fourth 
column gives the reduced form for each equivalence class, with ax² + bxy + cy² 
abbreviated to [a,b,c], followed by signs + and − indicating whether the characters 
have value +1 or −1 on each form. The forms in each genus have the same character 
values, and these forms are listed consecutively. Forms that lack mirror symmetry 
have middle coefficients ±b, indicating that the form and its mirror image give distinct 
elements of the class group. 


Forms Forms 
[1,1,1] + [1,1,6] + 
[1,0,1] + [2, +1,3] + 
[1,1,2] + [1,0,6] ++ 
[2,0,3] -- 
[1,0,2] + 
[1,1,7] + 
n Lal + a 
[1,0,3] + 
[1,1,8] + 
[1,1,4] ++ [2,+1,4] + 
EE Ei abs [1,0,8] ++ 
[1,0,4] + [3,2,3] Se 
[1,1,5] + [1,1,9] ++ 
[1,0,5] ++ [3,1,3] -- 


[2,2,3] ——- [1,0,9] ++ 


[2,2,5] +- 


|A| iha] Char. Forms 


39 | 4 | X3 X43 |[1,1,10] ++ 
[3,3,4] ++ 
[2,+1,5]- - 
40 Xg X5 -N 
[2,0,5] = 
TEER T + 
* 44 | 3 Xi [1,0,11] 
[3, +2,4] 
47 [1,1,12] 
[2, +1,6] 
ere 
* 48 X4 X3 |[1,0,12] 
[3,0,4] -+ 
51 X> Xız | [1,1,13] ++ 
[3,3,5] = 
52 X4 Xı3 | [1,0,13] ++ 
4 | X; Xi [1,14] ++ 
[4,3,4] ++ 
[2,+1,7]-—- 
Xg Xz |[1,0,14] ++ 
[2,0,7] ++ 
[3,+2,5]-—- 
59 [1,1,15] + 
[3,+1,5] + 
* 60 X3 X5 eee 
[3,0,5] — 
*63 | 4 | X3 X7 ae 
[4,1,4] ++ 
[2,+1,8]-+ 
* 64 X4 Xg a 
[4,4,5] +- 


++ +/+ + 


67|1| Xe o + 
68 | 4 | X4 Xız |[1,0,17] ++ 
[2,2,9] ++ 
[3,+2,6]-—- 
71/7] Xa |01,1,18] + 
[2,+1,9] + 
[3,+1,6] + 
[4, +3,5] + 
|Δ| 
* 72 


* 80 


83 


84 


87 


88 


91 


* 92 


95 


* 96 


* 99 


ha 


Char. 


eaa 
aaa 


X4 X5 


22) eRe Rs 


X3 X29 


H 


X5 X19 


B22) TE ET 


X3 X1 


Forms 


[1,0,18] 
[2,0,9] 
[1,1,19] 
[3,3,7] 
[1,0,19] 
[4, +2,5] 
[1,1,20] 
[2,+1,10] 
[4, +1,5] 
[1,0,20] 
[4,0,5] 
[3, +2,7] 
[1,1,21] 
[3, +1,7] 
[1,0,21] 
[2,2,11] 
[3,0,7] 
[5,4,5] 
[1,1,22] 
[4, +3,6] 
[2,+1,11] 
[3,3,8] 
[1,0,22] 
[2,0,11] 
[1,1,23] 
[5,3,5] 
Ea. 0,23] 
EA 
[1,1,24] 
[4, +1,6] 
[5,5,6] 
[2, +1,12] 
[3,+1,8] 
[1,0,24] 
[3,0,8] 
[4,4,7] 
[5,2,5] 
[1,1,25] 
[5,1,5] 
+++ 


-++ 

+—— 
++ 
-+ 
|Δ| h_Δ Char. Forms 
X4 X5 [1,0,25] ++ 
[2,2,13] +- 
Kros [1,1,26] + 
[2,+1,13] + 
[4, +3,7] + 


Xg X43 [1,0,26] ++ 
[3,+2,9] ++ 
[2,0,13] -- 
[5,+4,6] -- 
X 107 [1,1,27] + 

[3, +1,9] + 

* 108 ha X 3 [1,0, 27] + 
[4, +2,7] + 


111 X3, X37 [1,1,28] ++ 
[4,+1,7] ++ 
[3,3,10] ++ 
[2,+1,14] --— 
[5,+3,6] -- 
*112]| 2 X3 X7 [1,0,28] ++ 

[4,0,7] —+ 


115] 2 Xs, X23 [1,1,29] ++ 
[5,5,7] TOR 


116 X4, X29 [1,0,29] ++ 
[5,+2,6] ++ 
[2,2,15] -- 
[3,+2,10] --— 
119 | 10 Xz, Xı7 [1,1,30] ++ 
[2,+1,15] ++ 
[4, +3,8] ++ 
[3,+1,10] -- 
[5,+1,6] -- 
[6,5,6] —— 


120 |2-2| Xg, X3, X; | [1,0,30] 
[2,0,15] 
[3,0,10] - +- 
[5,0,6] --+ 


+ + 
| + 
| + 

Listed below are the 101 known negative discriminants Δ for which every 
primitive form has a mirror-symmetric topograph. This is equivalent to saying that each 
genus consists of a single equivalence class of forms, or that the class group is either 
the trivial group or a product of cyclic groups of order 2. The class number h_Δ is then 
a power of 2 determined by the number of distinct prime divisors of Δ. Asterisks in 
the table denote nonfundamental discriminants. Among the 101 discriminants there 
are 65 fundamental discriminants and, coincidentally, 65 even discriminants. 


|A| 


x*96=2°”.3 
* 99 = 37-11 
* 100 = 2? . 5? 
*112=24.7 
115 =5-23 


|A| 

120 =2°?-3-5 
123 =3-41 
132 =2°-3-11 
* 147 =3-49 
148 =4- 37 

* 160 =2°-5 
163 

168 = 23.3.7 
* 180 =2°.3?.5 
187 =11-17 
* 192 = 2°.3 
195 =3-5-13 
228 = 2°-3-19 
232 = 23.29 
235=5-47 

* 240 = 24.3-5 
267 = 3-89 
280 = 23.5.7 
* 288 = 2” . 3? 
312 = 28.3.13 
* 315 =3°-5-7 
340 = 27-5-17 
* 352=2°-11 
372 = 27 .3-31 
403 = 13-31 
408 = 2° -3-17 
420 = 2°-3-5-7 
427=7-61 
435 =3-5-29 
* 448 = 28.7 

* 480 = 2? -3-5 
483 =3-7-23 
520 =2°?-5-13 


> 
> 
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[Al 


555 =3-5-37 
595 =5-7-17 
627 =3-11-1 
660 = 27 -3-5 
* 672 =2°-3-7 


9 
-11 


708 = 2? -3-59 


715=5-11-13 
760 = 22-5-19 
795 =3-5-53 
840 = 2°?-3-5-7 
* 928 = 2° - 29 

* 960 = 2°-3-5 
1012 = 2? . 11 -23 
1092 = 2?.3.7-13 
* 1120 =2°-5-7 
1155=3-5-7°ll 
x 1248 = 2°-3-13 
1320 = 28.3.5.-11 
1380 = 2° 3 -5.23 
1428 = 2? .3.7.-17 
1435=5-7-41 
1540 = 2765+ 7611 
x 1632 = 2°-3-17 
1848 = 2? -3-7-11 
1995 =3-5-7-1l 
x 2080 = 2°-5-13 
3003 =3-7-11-13 
* 3040 = 2°-5-19 
3315 =3-5-13-17 
* 3360 =2°-3-5-7 
* 5280 = 2?-3-5-11 
5460 = 2°-3-5-7-13 
* 7392 =2°-3-7-11 


[Column h_Δ of class numbers for the discriminants above; only two entries, both equal to 16, are legible.]
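The defining property of the discriminants listed above can be checked by brute force for any single negative discriminant. The following is a minimal sketch, not the book's own method: it enumerates reduced primitive forms [a,b,c] using the classical reduction conditions -a < b ≤ a ≤ c (with b ≥ 0 when a = c), and it assumes the standard criterion that such a reduced form is equivalent to its mirror image exactly when b = 0, a = b, or a = c. The function names are hypothetical.

```python
from math import gcd


def reduced_primitive_forms(disc):
    """Reduced primitive forms (a, b, c) with b^2 - 4ac = disc < 0 and a > 0."""
    assert disc < 0 and disc % 4 in (0, 1)
    forms = []
    a = 1
    # A reduced form has |b| <= a <= c, hence 3a^2 <= 4ac - b^2 = -disc.
    while 3 * a * a <= -disc:
        for b in range(-a + 1, a + 1):
            if (b * b - disc) % (4 * a) != 0:
                continue
            c = (b * b - disc) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
        a += 1
    return forms


def has_mirror_symmetry(form):
    """Assumed criterion for a reduced form to be equivalent to its mirror image."""
    a, b, c = form
    return b == 0 or b == a or a == c


def every_form_symmetric(disc):
    """True if every reduced primitive form of this discriminant is mirror symmetric."""
    return all(has_mirror_symmetry(f) for f in reduced_primitive_forms(disc))
```

For Δ = -120 the enumeration returns the four forms [1,0,30], [2,0,15], [3,0,10], [5,0,6], all with b = 0. For Δ = -23, which is not in the list, it finds [2,1,3] and [2,-1,3], neither of which is mirror symmetric.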




3. Forms of Positive Nonsquare Discriminant 


This table is similar in layout to Table 1. For positive discriminants there is not a 
unique reduced form within each equivalence class so we have chosen a form which 
seemed simplest in some less precise sense. 


Δ | h_Δ | Char. | Forms        Δ | h_Δ | Char. | Forms


T, G <= [14,0,-1] -- 
0.41), = aie) sam 


EER on fee E 60 | 2:2} X4 X3 X5 zi Ge ee 
[15,0,-1] --+ 
Valea a = $ 20.28) Si 
A [1,0,-5] + [5,0, e. OR 
21 X3 F - l; 5 ++ _ 61 Ea X61 [1,0,-15] + 
[6,0,-1] -- ¥*68 Ee noe Eoi = 
28 2 X4 X7 [1,0,-7] TT 69 X3 X 23 [1, 1, me ++ 
[7,0,-1] -- [17,1,-1] -- 
ae Ce wee eee HRJ [1,0, 8 E 
* 32 X4 Xg |[1,0, a ++ [18,0,-1] +- 
Rosme SE a [L18] + 
33 X> X1 | [1,1,- ++ 76| 2 X4 X19 [1,0,-19] ++ 
[8, 1, Ee -— [19,0,-1] -- 
37 ES Kas ier. 4 77 Me fer 1op ae 
40 Xg Xs |[1,0, a ++ [19,1,-1] -- 
20,0,-1] -+ 
41 L1, a J | 
ere x 84 X3 X7 [1,0, a ++ 
r oA aS 
l l 85 Xs Xı7 [1, 7 a ++ 
*45 | 2| X3 Xs |[1,1,-11] ++ = 
3 X5 RARS — 
Lilie Mee gg Roce io 7 F 
* 48 HA [1,0,-12] ++ [22.0,-1] =- 
[Le Ove, 86 = Ries fea eee 
«52/1 | Xiy Mois 92 MO IEO nn 
53] 1 X 53 [1,1,-13] + [23,0,— ea 
X> X31 [1,1, = ++ 
(een: aa 


h_Δ | Char. | Forms
* 96 | 2·2 | χ4 χ8 χ3 | [1,0,-24]
[24,0,-1]
[3,0,-8]
[8,0,-3]
97 a X 97 [1,1,—24] 
w E ea, e 


104). 2| Xg Xis | T0261 
[2,0, -13] 
105'}2-2 | Xe XK (1, 1-261 
[26,1,—1] 
[2,1,—13] 
3,127) 


[1,0,-27]
[27,0,-1]
[1,1,-27]
[1,0,-28]
[28,0,-1]
[1,1,-28]
[1,1,-29]


109 Xios 


THR 
116] 1 | 


X113 
X99 


Soe ml oe creme 
[29, 1, -1] 

120 | 2·2 | χ8 χ3 χ5 | [1,0,-30]
[30,0,-1]
[2,0,-15]

iis 0221 

(4 Sees E 
Bt 0-=1) 

ZE Teeny 
x 128 xX, a [1,0, 32] 
iBA 0t] 

129 ee OE E 
BE es 
EE S Ka3- {NTE 0, 33) 
KERRAN 

i3 "| | XX I G3] 
bene 

136 | 4 | χ8 χ17 | [1,0,-34]
[34,0,-1]
+


++ 


+++ 
— +— 


++ 


[3,±2,-11] --


Δ | h_Δ


H eee 


: 


saad 


149 Ey X uo 


[1,1, a 


Char. Forms 


[1,1,-34]
[1,0,-35]
[35,0,-1]
[2,2,-17]
[17,2,-2]
[1,1,-35]
[35,1,-1]
[1,1,-36]
[4,1,-9]
[1,0,-37]
[1,1,-37]
[1,0,-38]
[38,0,-1]


* 153 on 

(38 y= 

157| As | Sage | ye 

* 160|2-2| X4 Xs Xs | [1,0, | 

[40, 0, -1] 

e] 

161 Aa IGG 
[40, 1, - 

165 | 2·2 | χ3 χ5 χ11 | [1,1,-41]
[41,1,-1]
[3,3,-13]
[13,3,-3]
168 | 2·2 | χ8 χ3 χ7 | [1,0,-42]
[42,0,-1]
[21,0,-2]

172 X4 Xas |[1,0, = 

[43,0 


15612:2|X4 X; X| (1,0, So 
[39,0,— 

[2;2; n 

[19,2, z 

[3,2,—13] 

TE E 

[2,0,—21] 

Rigs, PE er 
++
++ 


[2,±1,-18] --


+ 


[3,±2,-12] +


+ 


++ 




4. Periodic Separator Lines 


The dotted vertical lines are lines of mirror symmetry and the heavy dots along
the separator lines are points of rotational skew symmetry.


Δ  | Q
17 | [1,1,-4]
20 | [1,0,-5]
24 | [1,0,-6]
   | [6,0,-1]
40 | [1,0,-10]
   | [2,0,-5]


Nonstandard Terminology


In a few instances we have chosen not to use standard terminology for certain 
concepts, usually because the traditional names seem somewhat awkward in the con- 
text of this book, or not as suggestive of the meaning as they could be. Here is a short 
summary of the main instances where translation may be needed when reading other 
sources. 


Quadratic Forms. These are usually divided into three types, but for our purposes it 
is useful to split one of the three types into two for a total of four types as defined at 
the beginning of Chapter 5. Here are the traditional names with our equivalents: 


• definite = elliptic
• indefinite = hyperbolic or 0-hyperbolic
• semidefinite = parabolic


Besides the convenience of having separate names for hyperbolic and 0-hyperbolic
forms, the other motivation for the change is that the ordinary meanings of “definite” 
and “indefinite” do not seem to convey very well their mathematical meanings. 
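As an illustration of the four-way split, here is a small sketch that sorts a form ax^2 + bxy + cy^2 by its discriminant Δ = b^2 - 4ac. It assumes the correspondence elliptic for Δ < 0, parabolic for Δ = 0, 0-hyperbolic for Δ a positive square, and hyperbolic for Δ positive and nonsquare. The precise definitions are those given at the beginning of Chapter 5, and the function name `form_type` is hypothetical.

```python
from math import isqrt


def form_type(a, b, c):
    """Classify ax^2 + bxy + cy^2 by its discriminant (assumed correspondence)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return "elliptic"        # traditionally: definite
    if disc == 0:
        return "parabolic"       # traditionally: semidefinite
    root = isqrt(disc)
    # A positive square discriminant means the form takes the value 0
    # at some nonzero integer pair (x, y).
    return "0-hyperbolic" if root * root == disc else "hyperbolic"


print(form_type(1, 0, 1))    # x^2 + y^2
print(form_type(1, 0, -2))   # x^2 - 2y^2
print(form_type(1, 0, -1))   # x^2 - y^2
print(form_type(1, 2, 1))    # (x + y)^2
```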

What we call a symmetry of a quadratic form is more often called an automorph 
or automorphism of the form, although the latter terms are sometimes reserved just 
for orientation-preserving symmetries. We call a form having an orientation-reversing 
symmetry a mirror symmetric form, or a form with mirror symmetry, whereas clas- 
sically such forms are called ambiguous, a term that has suffered somewhat in the 
translation from Gauss’s original Latin. 


Representing Numbers by Quadratic Forms. The traditional terminology is to say 
that a quadratic form Q(x, y) represents a number n when there exist integers x 
and y such that Q(x, y) =n. However in this book we are almost always interested 
only in the case that x and y are coprime, so to avoid extra words to specify this 
every time, we take the word “represent” always to mean “represent with coprime 
integers x and y”. 
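The difference between the two conventions shows up already for x^2 + y^2. The sketch below searches a finite box for a coprime pair (x, y) with Q(x, y) = n. The helper name `represents` and the search bound are assumptions made for illustration, and a bounded search is only conclusive when the form cannot take the value n outside the box, as is the case for x^2 + y^2 with small n.

```python
from math import gcd


def represents(form, n, bound=50):
    """Return a coprime pair (x, y) with Q(x, y) = n and |x|, |y| <= bound, or None."""
    a, b, c = form
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            # gcd(x, y) == 1 also excludes (0, 0), since gcd(0, 0) == 0.
            if gcd(x, y) == 1 and a * x * x + b * x * y + c * y * y == n:
                return (x, y)
    return None


sum_of_squares = (1, 0, 1)
print(represents(sum_of_squares, 5))   # a coprime pair such as (±1, ±2) or (±2, ±1)
print(represents(sum_of_squares, 4))   # None: 4 = 2^2 + 0^2 only, and gcd(2, 0) = 2
```

So 4 is represented by x^2 + y^2 in the traditional sense but not in the sense used in this book.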


Primes. There is uniform agreement about what a prime number is when one is talking 
about positive integers, namely a number greater than 1 that is divisible only by itself 
and 1. For the sake of consistency we use the natural extension of this definition to 
other sorts of numbers considered in the last chapter of the book, namely Gaussian 
integers and their analogues in quadratic fields Q(√d). Thus we call such numbers
prime if the only way they factor is with one factor a unit (and they are not units
or 0 themselves). Over the years it has become more usual to call numbers with this
property irreducible rather than prime, using the term prime for numbers with the
property that if they divide a product, then they must divide one of the factors. For
example in the ring Z[√−5] the number 2 is prime according to our definition but
not according to the standard definition since 2 divides 6 = (1 + √−5)(1 − √−5) but
does not divide either factor 1 ± √−5.
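The example can be verified with a few lines of integer arithmetic. In this sketch an element a + b√−5 is stored as the pair (a, b), and the helpers `norm`, `mul` and `divides` are hypothetical names. It relies on the multiplicative norm N(a + b√−5) = a^2 + 5b^2: a factorization of 2 into two nonunits would need a factor of norm 2, and no such element exists.

```python
def norm(u):
    """Norm of a + b*sqrt(-5), with u = (a, b)."""
    a, b = u
    return a * a + 5 * b * b


def mul(u, v):
    """Product of two elements of Z[sqrt(-5)]."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)


def divides(u, v):
    """True if v = u * w for some w in Z[sqrt(-5)], with u nonzero."""
    a, b = u
    n = norm(u)
    p, q = mul(v, (a, -b))          # v times the conjugate of u
    return n != 0 and p % n == 0 and q % n == 0


two = (2, 0)
# a^2 + 5b^2 = 2 would force b = 0 and |a| <= 1, so this small range suffices.
has_norm_two = any(norm((a, b)) == 2 for a in range(-2, 3) for b in range(-1, 2))

print(has_norm_two)                 # no element of norm 2, so 2 does not factor
print(mul((1, 1), (1, -1)))         # (1 + sqrt(-5))(1 - sqrt(-5))
print(divides(two, (6, 0)))         # 2 divides 6
print(divides(two, (1, 1)), divides(two, (1, -1)))
```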

We make a similar divergence from standard terminology when we define prime 
ideals in Chapter 8. 


Topographs. Of much more recent origin is Conway’s notion of the topograph of 
a quadratic form. Here we do not always follow Conway’s picturesque terminology. 
What we call a separator line he called a river, and our source vertices and edges are 
his simple and double wells. He called a region with label 0 a lake but we call this just 
a 0 region. 